Regulation of Gene Expression Using Single-Chain, Monomeric,
Ligand Dependent Polypeptide Switches

Technical Field of the Invention

The field of this invention is regulation of transcription. More
particularly, the present invention pertains to polypeptides that can activate or
repress transcription in a small molecule ligand-dependent manner.

Background of the Invention

Designed transcription factors with defined target specificity and
regulatory function provide invaluable tools for basic and applied research, and
for gene therapy. Accordingly, the design of sequence-specific DNA binding
domains has been the subject of intense interest for the last two decades. Of the
many classes of DNA binding proteins studied, the modular Cys2-His2 zinc finger
DNA binding motif has shown the most promise for the production of proteins
with tailored DNA binding specificity. The novel architecture of this class of
proteins provides for the rapid construction of gene-specific targeting devices.
Polydactyl zinc finger proteins are most readily prepared by assembly of modular
zinc finger domains recognizing predefined three-nucleotide sequences (See, e.g.,
Segal, D. J., Dreier, B., Beerli, R. R., and Barbas, C. F., III (1999) Proc. Natl. Acad.
Sci. USA 96, 2758-2763; Beerli, R. R., Segal, D. J., Dreier, B., and Barbas, C. F.,
III (1998) Proc. Natl. Acad. Sci. USA 95, 14628-14633; and Beerli, R. R., Dreier, B.,
and Barbas, C. F., III (2000) Proc. Natl. Acad. Sci. USA 97, 1495-1500). Polydactyl
proteins can be assembled using variable numbers of zinc finger domains of
varied specificity, providing DNA binding proteins that not only recognize novel
sequences but also sequences of varied length. By combining six zinc finger
domains, proteins have been produced that recognize 18 contiguous base pairs of
DNA sequence, a DNA address sufficiently complex to specify any locus in the 4
billion-base pair human genome (or any other genome). Fusion of polydactyl zinc
finger proteins of this type to activation or repression domains provides
transcription factors that efficiently and specifically modulate the expression of
both transgenes and endogenous genes (Beerli, R. R., Segal, D. J., Dreier, B., and
Barbas, C. F., III (1998) Proc. Natl. Acad. Sci. USA 95, 14628-14633; and Beerli, R.
R., Dreier, B., and Barbas, C. F., III (2000) Proc. Natl. Acad. Sci. USA 97,
1495-1500).

While the availability of designed transcription factors with tailored DNA
binding specificities provides novel opportunities in transcriptional regulation,
additional applications would be available to ligand-dependent transcription
factors. Designer zinc finger proteins dependent on small molecule inducers
would have a number of applications, both for the regulation of endogenous
genes, and for the development of inducible expression systems for the regulation
of transgenes. Natural transcription factors are regulated by a number of different
mechanisms, including posttranslational modification such as phosphorylation
(Janknecht, R., and Hunter, T. (1997) EMBO J. 16, 1620-1627; Darnell, J. E., Jr.
(1997) Science 277, 1630-1635), or by ligand binding. The prototype ligand-
activated transcription factors are members of the nuclear hormone receptor
family, including the receptors for sex steroids or adrenocorticoids (Carson-
Jurica, M. A., Schrader, W. T., and O'Malley, B. W. (1990) Endocrine Reviews 11,
201-220; Evans, R. M. (1988) Science 240, 889-895). These receptors are held
inactive in the absence of hormone, by association with a number of inactivating
factors including hsp90 (Pratt, W. B., and Toft, D. O. (1997) Endocrine Rev. 18,
306-360). Upon ligand binding, nuclear hormone receptors dissociate from the
inactivating complex, dimerize, and become able to bind DNA and activate
transcription (Carson-Jurica, M. A., Schrader, W. T., and O'Malley, B. W. (1990)
Endocrine Reviews 11, 201-220; Evans, R. M. (1988) Science 240, 889-895;
and Pratt, W. B., and Toft, D. O. (1997) Endocrine Rev. 18, 306-360). Significantly,
not only hormone binding but also inactivation and dimerization functions reside
within the ligand binding domain (LBD) of these proteins (Beato, M. (1989) Cell
56, 335-344). This fact has been exploited experimentally, and steroid hormone
receptor LBDs have found wide use as tools to render heterologous proteins
hormone-dependent.
In particular, the estrogen receptor (ER) LBD has been used to render the
functions of c-Myc (Eilers, M., Picard, D., Yamamoto, K. R., and Bishop, J. M.
(1989) Nature 340, 66-68), c-Fos (Superti-Furga, G., Bergers, G., Picard, D., and
Busslinger, M. (1991) Proc. Natl. Acad. Sci. USA 88, 5114-5118), and even the
cytoplasmic kinase c-Raf (Samuels, M. L., Weber, M. J., Bishop, J. M., and
McMahon, M. (1993) Mol. Cell. Biol. 13, 6241-6252) hormone-dependent. To
develop an inducible expression system for use in basic research and gene
therapy, the availability of ligand-dependent transcriptional regulators is a
prerequisite. Preferably, these regulators would be activated by a small
molecule inducer with no other biological activity, bind specific sequences
present only in the target promoter, and have low immunogenicity. A number of
ligand-regulated artificial transcription factors have been generated by various
means, using functional domains derived from either prokaryotes (Gossen, M.,
and Bujard, H. (1992) Proc. Natl. Acad. Sci. USA 89, 5547-5551; Gossen, M.,
Freundlieb, S., Bender, G., Müller, G., Hillen, W., and Bujard, H. (1995) Science
268, 1766-1769; Labow, M. A., Baim, S. B., Shenk, T., and Levine, A. J.
(1990) Mol. Cell. Biol. 10, 3343-3356; Baim, S. B., Labow, M. A., Levine, A. J.,
and Shenk, T. (1991) Proc. Natl. Acad. Sci. USA 88, 5072-5076) or eukaryotes
(Christopherson, K. S., Mark, M. R., Bajaj, V., and Godowski, P. J. (1992) Proc.
Natl. Acad. Sci. USA 89, 6314-6318; No, D., Yao, T.-P., and Evans, R. M.
(1996) Proc. Natl. Acad. Sci. USA 93, 3346-3351; Wang, Y., O'Malley, B. W.,
Jr., Tsai, S., and O'Malley, B. W. (1994) Proc. Natl. Acad. Sci. USA 91, 8180-8184;
Wang, Y., Xu, J., Pierson, T., O'Malley, B. W., and Tsai, S.
Y. (1997) Gene Therapy 4, 432-441; Braselmann, S., Graninger, P., and
Busslinger, M. (1993) Proc. Natl. Acad. Sci. USA 90, 1657-1661; Louvion, J. F.,
Havaux-Copf, B., and Picard, D. (1993) Gene 131, 129-134; Rivera, V. M.,
Clackson, T., Natesan, S., Pollock, R., Amara, J. F., Keenan, T., Magari, S. R.,
Phillips, T., Courage, N. L., Cerasoli, F., Jr., Holt, D. A., and Gilman, M. (1996)
Nature Med. 2, 1028-1032).

Of the functional domains derived from eukaryotic proteins, nuclear
hormone receptor LBDs have been the most widely used. In particular, regulators
based on the Gal4 DNA binding domain (DBD) fused to a human ER
(Braselmann, S., Graninger, P., and Busslinger, M. (1993) Proc. Natl. Acad. Sci.
USA 90, 1657-1661; Louvion, J. F., Havaux-Copf, B., and Picard, D. (1993) Gene
131, 129-134) or progesterone receptor (PR) LBD (Wang, Y., O'Malley, B. W.,
Jr., Tsai, S., and O'Malley, B. W. (1994) Proc. Natl. Acad. Sci. USA 91, 8180-8184;
Wang, Y., Xu, J., Pierson, T., O'Malley, B. W., and Tsai, S. Y. (1997) Gene
Therapy 4, 432-441), as well as the ecdysone-inducible system based on the
Drosophila ecdysone receptor (EcR) and the mammalian retinoid X receptor (RXR)
(Christopherson, K. S., Mark, M. R., Bajaj, V., and Godowski, P. J. (1992) Proc.
Natl. Acad. Sci. USA 89, 6314-6318; No, D., Yao, T.-P., and Evans, R. M. (1996)
Proc. Natl. Acad. Sci. USA 93, 3346-3351) have been described. Compared to the
heterodimeric EcR/RXR system, regulators based on the ER and PR LBDs have
the important advantage that they function as homodimers and require the
delivery of only one cDNA. However, while ecdysone has no known biological
effect on mammalian cells, estrogen and progesterone will elicit a biological
response in cells or tissues that express the endogenous steroid receptors. With the
availability of mutated ER and truncated PR LBDs that have lost
responsiveness to their natural ligands but not to synthetic antagonists such as 4-
hydroxytamoxifen (4-OHT) (Littlewood, T. D., Hancock, D. C., Danielian, P. S.,
Parker, M. G., and Evan, G. I. (1995) Nucleic Acids Res. 23, 1686-1690) or RU486
(Vegeto, E., Allan, G. F., Schrader, W. T., Tsai, M.-J., McDonnell, D. P., and
O'Malley, B. W. (1992) Cell 69, 703-713), respectively, this is no longer of great
concern. Thus, steroid hormone receptor LBD-based inducible expression systems
can be developed that function independently of the endogenous steroid receptors.
To date, this has been shown for the PR LBD through the development of an
RU486-inducible expression system based on the Gal4 DBD (Wang, Y.,
O'Malley, B. W., Jr., Tsai, S., and O'Malley, B. W. (1994) Proc. Natl. Acad. Sci.
USA 91, 8180-8184; Wang, Y., Xu, J., Pierson, T., O'Malley, B. W., and Tsai, S.
Y. (1997) Gene Therapy 4, 432-441). An inducible expression system based on a
point-mutated (G525R) ER LBD (Littlewood, T. D., Hancock, D. C., Danielian,
P. S., Parker, M. G., and Evan, G. I. (1995) Nucleic Acids Res. 23, 1686-1690) that
has lost responsiveness to estrogen but not to the antagonist 4-OHT has not been
described to date. Designed zinc finger proteins have a number of advantages
compared to other DBDs, including the one derived from Gal4, since the ability to
engineer DNA binding specificities allows ligand-dependent regulators to be
directed to any desired artificial or natural promoter. Here we explore the utility
of fusion proteins between designed zinc finger proteins and nuclear hormone
receptor LBDs for the inducible control of gene expression.

Brief Summary of the Invention 

In one embodiment, the present invention provides a non-naturally
occurring polypeptide that contains two ligand binding domains operatively
linked to each other and a first functional domain operatively linked to one of the
ligand binding domains. The ligand binding domains are preferably covalently
linked to each other. More preferably, the two binding domains are covalently
linked by means of a peptide linker that contains from about 10 to about 40 amino
acid residues, preferably from about 15 to about 35 amino acid residues and, more
preferably, from about 18 to about 30 amino acid residues.

In one embodiment, the ligand binding domains are derived from nuclear
hormone receptors. The ligand binding domains can be derived from the same or
different nuclear hormone receptors. Exemplary and preferred nuclear hormone
receptors are steroid hormone receptors such as an estrogen receptor, a
progesterone receptor, an ecdysone receptor and a retinoid X receptor.

The first functional domain can be any domain that alters the function or
activity of a target nucleotide. In one embodiment, the first functional domain is a
nucleotide binding domain. Preferably, the nucleotide binding domain is a DNA
binding domain. The DNA binding domain preferably contains at least one zinc
finger DNA binding motif, more preferably from two to twelve zinc finger DNA
binding motifs and, even more preferably, from three to six zinc finger DNA
binding motifs. In one embodiment, the zinc finger DNA binding motifs
specifically bind to a nucleotide sequence of the formula (GNN)1-6, where G is
guanine and N is any nucleotide. In another embodiment, the first functional
domain is a transcriptional regulating domain such as a transcription activation
domain or a transcription repression domain.

In still another embodiment, the polypeptide gene switch contains a
second functional domain. In accordance with this embodiment, a preferred first
functional domain is a nucleotide binding domain and the second functional
domain is a transcriptional regulating domain.

In one embodiment, a polypeptide of this invention includes (a) a DNA
binding domain having from three to six zinc finger DNA binding motifs; (b) a
first ligand binding domain derived from a retinoid X receptor operatively linked to
the DNA binding domain, a second ligand binding domain derived from an
ecdysone receptor linked to the first ligand binding domain with a peptide spacer
of from 18 to 36 amino acid residues; and (c) a transcription regulating domain
operatively linked to the second ligand binding domain.

In still another embodiment, a polypeptide gene switch includes (a) a
DNA binding domain having from three to six zinc finger DNA binding motifs;
(b) a first ligand binding domain derived from a progesterone receptor operatively
linked to the DNA binding domain, a second ligand binding domain derived from
a progesterone receptor linked to the first ligand binding domain with a peptide
spacer of from 18 to 36 amino acid residues; and (c) a transcription regulating
domain operatively linked to the second ligand binding domain.

In another aspect, the present invention provides polynucleotides that 
encode a polypeptide gene switch of the invention, expression vectors containing 
such polynucleotides and cells containing such nucleotides. 

Another aspect of this invention provides a process of regulating the
function of a target nucleotide that contains a defined sequence. The process
includes the step of exposing the target nucleotide to a polypeptide of this
invention in the presence of a ligand that binds at least one of the ligand binding
domains of the polypeptide. In a related aspect, the present invention provides a
process for regulating transcription (e.g., expression) of a target nucleotide (e.g.,
gene). In accordance with that process, a target nucleotide that contains a defined
sequence is exposed to a polypeptide of this invention in the presence of a ligand
that binds to at least one of the ligand binding domains of that polypeptide. The
polypeptide contains a nucleotide binding domain that specifically binds to the
defined sequence in the target nucleotide. Where the polypeptide gene switch
contains a transcription repression domain, regulating is repression. Where the
polypeptide gene switch contains a transcription activation domain, regulating is
activation.

Brief Description of the Drawings 

In the drawings that form a portion of the specification:
Figure 1 shows generation of designed zinc finger proteins with novel
DNA binding specificity. A, amino acid sequence of the three-finger proteins B3
and N1. DNA recognition helix positions -2 to 6, shown in bold print, were
grafted into the framework of the three finger protein Sp1C. The locations of the
antiparallel β sheets and the α helices, structural hallmarks of zinc finger
domains, are as indicated. DNA binding specificity of each finger is shown on the
left. F1-3, Finger 1-3. B, ELISA analysis of DNA binding specificity. Zinc finger
proteins were expressed in E. coli as MBP fusions and purified. Specificity of
binding was analyzed by measuring binding to immobilized biotinylated hairpin
oligonucleotides containing the indicated 5'-(GNN)3-3' sequences. Black bars, B3;
gray bars, N1. The maximal signals were normalized to 1. The KD value for
binding to the specific target sequence was measured by electrophoretic mobility
shift assay and is labeled on top of the corresponding bars.

Figure 2 shows regulation of gene expression by hormone-dependent,
single-chain ER fusion constructs. A, structure of ER fusion proteins. E2C, six
finger protein; L, flexible peptide linker. B, fusion proteins with a single ER-LBD
bind as dimers. HeLa cells were cotransfected with a C7-ER-VP64 expression
vector, and the indicated TATA luciferase reporter plasmids carrying either one or
two C7 binding sites. 24 h after transfection, cells were either left untreated (-), or
100 nM 4-OHT was added (+). Luciferase activity in total cell extracts was
measured 48 h after transfection. Each bar represents the mean value (+/- SD) of
duplicate measurements. C, control plasmid pcDNA3 that does not express a
fusion protein. C, D, regulation of transcription through a single binding site by
fusion proteins with two ER-LBDs. HeLa cells were cotransfected with the
indicated expression vectors and the E2C-TATA-luciferase reporter plasmid,
carrying a single E2C binding site upstream of a TATA box. 4-OHT induction
and measurement of luciferase activity were carried out as described in B.

WO 02/06463    PCT/EP01/08190

Figure 3 shows regulation of gene expression by hormone-dependent,
single-chain RXR/EcR fusion constructs. A, structure of single-chain RXR/EcR
fusion proteins. B, regulation of transcription through a single binding site. HeLa
cells were cotransfected with the indicated expression vectors and the
E2C-TATA-luciferase reporter plasmid, carrying a single E2C binding site. 24 h after
transfection, cells were either left untreated (-), or 5 µM Ponasterone A was added
(+). Luciferase activity in total cell extracts was measured 48 h after transfection.
Each bar represents the mean value (+/- SD) of duplicate measurements.
pcDNA3.1, control plasmid that does not express a fusion protein.
Figure 4 shows the nucleotide (SEQ ID NO: 31) and amino acid residue
sequence (SEQ ID NO: 32) of zinc finger binding domain B3B.

Figure 5 shows the nucleotide (SEQ ID NO: 33) and amino acid residue
sequence (SEQ ID NO: 34) of zinc finger binding domain 2C7.

Figure 6 shows the nucleotide (SEQ ID NO: 35) and amino acid residue
sequence (SEQ ID NO: 36) of zinc finger binding domain B3C2.

Figure 7 shows the nucleotide (SEQ ID NO: 37) sequence of repression
domain (KRAB-A)2.

Figure 8 shows the nucleotide (SEQ ID NO: 38) sequence of repression
domain (SID)2.

Figure 9 shows the nucleotide (SEQ ID NO: 39) and amino acid residue
sequence (SEQ ID NO: 40) of polypeptide E2C-ER-L-ER-VP64.

Figure 10 shows the nucleotide (SEQ ID NO: 41) and amino acid residue
sequence (SEQ ID NO: 42) of polypeptide E2C-ER-LL-ER-VP64.

Detailed Description of the Invention
I. The Invention

The present invention provides polypeptide gene switches,
polynucleotides that encode such polypeptides, expression vectors that contain
such polynucleotides, cells that contain such expression vectors or
polynucleotides and processes for regulating target nucleotide function using such
polypeptides, polynucleotides and expression vectors. Unlike existing gene
switches that contain a single ligand binding domain together with a DNA binding
domain and/or a transcriptional regulating domain, polypeptide gene switches of
the present invention contain two ligand binding domains. Upon binding of the
ligand, an intramolecular conformational change occurs that allows for alignment of
the functional domains to the target gene of interest. An advantage of the present
gene switches over existing gene switches, therefore, is the need for only a single
molecular switch and a single expression vector for production of that switch.
II. Polypeptides

A polypeptide gene switch of the present invention includes at least three
components: two ligand binding domains (LBDs) and a first functional domain
(FD-1). The ligand binding domains are operatively linked to the first functional
domain such that the polypeptide, in the presence of a defined ligand that binds to
at least one of the ligand binding domains, can alter the function of a nucleotide.
The domains can be arranged in any order. As shown below, the ligand binding
domains can be situated in either the amino- or carboxyl-terminal direction from
the first functional domain.

LBDs - FD-1

FD-1 - LBDs


A polypeptide of this invention is non-naturally occurring. As used
herein, the term "non-naturally occurring" means, for example, one or more of the
following: (a) a peptide comprised of a non-naturally occurring amino acid
sequence; (b) a peptide having a non-naturally occurring secondary structure not
associated with the peptide as it occurs in nature; (c) a peptide which includes one
or more amino acids not normally associated with the species of organism in
which that peptide occurs in nature; (d) a peptide which includes a stereoisomer
of one or more of the amino acids comprising the peptide, which stereoisomer is
not associated with the peptide as it occurs in nature; (e) a peptide which includes
one or more chemical moieties other than one of the natural amino acids; or (f) an
isolated portion of a naturally occurring amino acid sequence (e.g., a truncated
sequence). A polypeptide of this invention exists in an isolated form and is purified
to be substantially free of contaminating substances. A polypeptide is synthetic in
nature. That is, the polypeptide is isolated and purified from natural sources or
made de novo using techniques well known in the art.
A. Ligand Binding Domain (LBD) 

Each LBD is an amino acid residue sequence that is capable of binding, and
binds, a particular ligand. Binding of the ligand to the LBD alters the
conformation/function of the polypeptide and allows for regulating a function of a
target nucleotide. In the absence of ligand, the gene switch does not work to alter
nucleotide function. At least one of the LBDs is capable of binding, and binds, a
particular ligand. Both LBDs can bind a particular ligand. Thus, the LBDs can be
the same or different. Preferred LBDs are derived from nuclear hormone
receptors such as steroid hormone receptors.

Exemplary and preferred steroid receptors that can serve as the source of
ligand binding domains include the estrogen receptor (ER), progesterone receptor
(PR), glucocorticoid-α receptor, glucocorticoid-β receptor, mineralocorticoid
receptor, androgen receptor, thyroid hormone receptor, retinoic acid receptor
(RAR), retinoid X receptor (RXR), Vitamin D receptor, COUP-TF receptor,
ecdysone receptor (EcR), Nurr-1 receptor and orphan receptors. A preferred EcR
is derived either from Drosophila melanogaster (DE) or Bombyx (BE).

As is well known in the art, steroid hormone receptors are composed of a DNA
binding domain and a ligand binding domain. The DNA binding domain contains
the receptor regulating sequence and binds DNA, and the ligand binding domain
binds the specific biological compound (ligand) to activate the receptor. The term
"ligand" refers to any compound which activates the receptor, usually by
interaction with (binding) the ligand binding domain of the receptor. However,
ligands also include compounds that activate the receptor without binding. Where
used in a polypeptide gene switch of the present invention, it is preferred that the
ligand binding domain be modified so that it binds a ligand
other than the naturally occurring ligand (e.g., steroid hormone). Means of altering
or derivatizing naturally occurring nuclear hormone receptor ligand binding
domains to alter the binding specificity are well known in the art (See, e.g., United
States Patent Nos. 5,874,534 and 5,599,904, the disclosures of which are
incorporated herein by reference). Similarly, means for altering the estrogen
receptor to change its binding affinity have been reported [See, e.g., Littlewood et al.,
Nucleic Acids Res., 23(10):1686-1690, 1995].

The term "non-natural ligand" refers to compounds that are
normally not found in animals or humans and which bind to the ligand binding
domain of a receptor. The ligand can also be a "non-native ligand", a ligand that
is not naturally found in the specific organism (man or animal) in which gene
therapy is contemplated. For example, certain insect hormones such as ecdysone
are not found in humans. This is an example of a non-native hormone to the
animal or human.

Examples of non-natural ligands, anti-hormones and non-native ligands
include the following: 11β-(4-dimethylaminophenyl)-17β-hydroxy-17α-propinyl-
4,9-estradiene-3-one (RU38486 or Mifepristone); 11β-(4-dimethylaminophenyl)-
17α-hydroxy-17β-(3-hydroxypropyl)-13α-methyl-4,9-gonadiene-3-one
(ZK98299 or Onapristone); 11β-(4-acetylphenyl)-17β-hydroxy-17α-(1-propinyl)-
4,9-estradiene-3-one (ZK112993); 11β-(4-dimethylaminophenyl)-17β-hydroxy-
17α-(3-hydroxy-1(Z)-propenyl)-estra-4,9-diene-3-one (ZK98734); (7β,11β,17β)-
11-(4-dimethylaminophenyl)-7-methyl-4',5'-dihydrospiro[estra-4,9-diene-
17,2'(3'H)-furan]-3-one (Org31806); (11β,14β,17α)-4',5'-dihydro-11-(4-
dimethylaminophenyl)-[spiroestra-4,9-diene-17,2'(3'H)-furan]-3-one (Org31376);
5-alpha-pregnane-3,20-dione. Additional non-natural ligands include, in general,
synthetic non-steroidal estrogenic or anti-estrogenic compounds, broadly defined
as selective estrogen receptor modulators (SERMs). Exemplary compounds
include, but are not limited to, tamoxifen and raloxifene.
Exemplary and preferred ligands for use with various ligand binding
domains are (1) EcR: Ponasterone A, Muristerone A, GS-E (Invitrogen),
Tebufenozide; (2) ER: estrogen antagonists such as 4-hydroxy-tamoxifen, ICI
164384, RU 54876, Raloxifene; and (3) PR: progesterone antagonists such as RU
486, RU 38486, and Onapristone.

An especially preferred LBD derived from a progesterone receptor
comprises amino acid residues 645-914 from the human progesterone receptor.
An exemplary LBD derived from an estrogen receptor comprises amino acid
residues 282-599 from the mouse G525R mutant.

The two LBDs are separated by an amino acid residue sequence linker that
contains from about 10 to about 50 amino acid residues. Preferably, the spacer
contains from about 15 to about 40 amino acid residues and, more preferably,
from about 18 to about 35 amino acid residues. Exemplary and preferred spacers
contain 18 (L), 30 (LL), or 36 (LLL) amino acid residues.
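
The spacer length ranges stated above can be written down as a small lookup, which makes it easy to see where the exemplary spacers fall. This is only an illustrative sketch: the tier names and the function `linker_tier` are hypothetical, and "about" is treated as an exact bound, which is a simplifying assumption.

```python
# Spacer-length ranges as stated in the text. Treating "about" as an exact
# bound is an assumption of this sketch, not a statement of the specification.
TIERS = [
    ("more preferred", 18, 35),
    ("preferred", 15, 40),
    ("stated range", 10, 50),
]

# Exemplary spacers named in the text and their lengths in amino acid residues.
EXEMPLARY_SPACERS = {"L": 18, "LL": 30, "LLL": 36}


def linker_tier(n_residues: int) -> str:
    """Return the narrowest stated range that contains the given spacer length."""
    for name, low, high in TIERS:
        if low <= n_residues <= high:
            return name
    return "outside stated range"


if __name__ == "__main__":
    for label, length in EXEMPLARY_SPACERS.items():
        print(f"{label}: {length} residues -> {linker_tier(length)}")
```

With exact bounds, the 36-residue LLL spacer falls just outside the 18 to 35 range and lands in the 15 to 40 range; the word "about" in the text leaves room for that.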
B. Functional Domains 

A second component of a present polypeptide is a functional domain. As
used herein, the term "functional domain" and its grammatical equivalents
mean an amino acid residue sequence that binds to, alters the structure of, and/or
alters the function of, a nucleotide. Exemplary such functional domains include
nucleotide binding domains, transcriptional regulating domains (e.g., transcription
activation domains and transcription repression domains) and domains having
nuclease activity. Such domains are well known in the art.
1. Nucleotide Binding Domains 
A functional domain of a polypeptide can be a nucleotide binding domain:
a sequence of amino acid residues that recognize and bind to a defined nucleotide
sequence. The target nucleotide sequence can be an RNA sequence or,
preferably, a DNA sequence. Amino acid residue sequences that recognize and
bind to defined DNA sequences are well known in the art (e.g., GAL4). Any such
DNA binding peptide can be used as a DNA binding domain of a polypeptide
gene switch of this invention. It is preferred, however, that the DNA binding
domain of a present gene switch be one or more DNA binding zinc finger motifs.
Such zinc finger DNA binding motifs are well known in the art (See, e.g., PCT
Patent Application Nos. WO 95/19421 and WO 98/54311, the disclosures of
which are incorporated herein by reference). A DNA binding domain of a
polypeptide gene switch of this invention, thus, preferably includes a multiple
finger, polydactyl, zinc finger peptide that is designed to bind specific nucleotide
target sequences.

The present disclosure is based on the recognition of the structural features
unique to the Cys2-His2 zinc finger domain, which consists of a simple ββα fold of
approximately 30 amino acids in length. Structural stability of this fold is
achieved by hydrophobic interactions and by chelation of a single zinc ion by the
conserved Cys2-His2 residues (Lee, M.S., Gippert, G.P., Soman, K.V., Case, D.A.
& Wright, P.E. (1989) Science 245, 635-637). Nucleic acid recognition is
achieved through specific amino acid side chain contacts originating from the α-
helix of the domain, which typically binds three base pairs of DNA sequence
(Pavletich, N.P. & Pabo, C.O. (1991) Science 252, 809-17; Elrod-Erickson, M.,
Rould, M.A., Nekludova, L. & Pabo, C.O. (1996) Structure 4, 1171-1180).
Unlike other nucleic acid recognition motifs, simple covalent linkage of multiple
zinc finger domains allows the recognition of extended asymmetric sequences of
DNA.

Studies of natural zinc finger proteins have shown that three zinc finger
domains can bind 9 bp of contiguous DNA sequence (Pavletich, N.P. & Pabo,
C.O. (1991) Science 252, 809-17; Swirnoff, A.H. & Milbrandt, J. (1995) Mol.
Cell. Biol. 15, 2275-87). Whereas recognition of 9 bp of sequence is insufficient
to specify a unique site within even the small genome of E. coli, polydactyl
proteins containing six zinc finger domains can specify 18-bp recognition (Liu,
Q., Segal, D.J., Ghiara, J.B. & Barbas III, C.F. (1997) Proc. Natl. Acad. Sci. USA
94, 5525-5530). With respect to the development of a universal system for gene
control, an 18-bp address can be sufficient to specify a single site within all
known genomes. The efficacy of such six-finger proteins in gene activation and
repression within living human cells has recently been shown (Liu, Q., Segal, D.J.,
Ghiara, J.B. & Barbas III, C.F. (1997) Proc. Natl. Acad. Sci. USA 94, 5525-5530).
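
The uniqueness argument above can be checked with a rough back-of-the-envelope estimate. The sketch below is illustrative only: it assumes a random sequence with equal base frequencies (real genomes are not random), counts both DNA strands, uses the 4 billion bp human genome figure given in the Background, and uses an approximate E. coli genome size of 4.6 million bp, which is not stated in this document. The function name `expected_chance_sites` is hypothetical.

```python
def expected_chance_sites(genome_bp: float, site_bp: int,
                          both_strands: bool = True) -> float:
    """Expected number of chance occurrences of one specific site of length
    site_bp in a random genome of genome_bp base pairs.

    Assumes independent, equally likely bases (a simplification).
    """
    positions = genome_bp * (2 if both_strands else 1)
    return positions / (4 ** site_bp)


if __name__ == "__main__":
    E_COLI_BP = 4.6e6   # approximate size; an assumption of this sketch
    HUMAN_BP = 4.0e9    # figure used in the Background section

    # Three fingers recognize 9 bp; six fingers recognize 18 bp.
    print("9 bp in E. coli :", expected_chance_sites(E_COLI_BP, 9))
    print("18 bp in human  :", expected_chance_sites(HUMAN_BP, 18))
```

Under these assumptions a given 9-bp site is expected to occur by chance roughly 35 times in E. coli, whereas a given 18-bp site is expected roughly 0.12 times in a human-sized genome, which is consistent with the statement that six fingers can address a single site.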
The zinc finger-nucleotide binding peptide domain can be derived or
produced from a wild type zinc finger protein by truncation or expansion, or as a
variant of the wild type-derived polypeptide by a process of site directed
mutagenesis, or by a combination of the procedures. The term "truncated" refers
to a zinc finger-nucleotide binding polypeptide that contains fewer than the full
number of zinc fingers found in the native zinc finger binding protein or that has
been deleted of non-desired sequences. For example, truncation of the zinc finger-
nucleotide binding protein TFIIIA, which naturally contains nine zinc fingers,
might be a polypeptide with only zinc fingers one through three. Expansion refers
to a zinc finger polypeptide to which additional zinc finger modules have been
added. For example, TFIIIA may be extended to 12 fingers by adding 3 zinc
finger modules from more than one wild type polypeptide, thus resulting in a
"hybrid" zinc finger-nucleotide binding polypeptide.

The term "mutagenized" refers to a zinc finger derived-nucleotide binding
polypeptide that has been obtained by performing any of the known methods for
accomplishing random or site-directed mutagenesis of the DNA encoding
proteins. For instance, in TFIIIA, mutagenesis can be performed to replace non-
conserved residues in one or more of the repeats of the consensus sequence.
Truncated zinc finger-nucleotide binding proteins can also be mutagenized.
Examples of known zinc finger-nucleotide binding polypeptides
that can be truncated, expanded, and/or mutagenized according to the present
invention in order to inhibit the function of a nucleotide sequence containing a
zinc finger-nucleotide binding motif include TFIIIA and zif268. Other zinc
finger-nucleotide binding proteins will be known to those of skill in the art.
A zinc finger DNA binding domain can be made using a variety of
standard techniques well known in the art. Phage display libraries of zinc finger
proteins were created and selected under conditions that favored enrichment of
sequence specific proteins. Zinc finger domains recognizing a number of
sequences required refinement by site-directed mutagenesis that was guided by
both phage selection data and structural information.
A DNA binding domain used in a polypeptide of this invention is
preferably a zinc finger-nucleotide binding peptide that binds to a (GNN)1-6
nucleotide sequence. Zinc fingers that bind specifically to (GNN)1-6 have been
described in United States Patent Application Serial Number 09/173,941, filed
October 16, 1998 (the disclosure of which is incorporated herein by reference).

Exemplary and preferred zinc finger DNA binding domains are designated
herein as E2C, C7, B3B, 2C7, B3C2 and N1. A detailed description of the
preparation of polypeptide gene switches containing zinc finger DNA binding
domains can be found hereinafter in the Examples. The amino acid residue and
encoding nucleotide sequences for B3B, 2C7 and B3C2 are shown in FIGs. 4-6,
respectively.

2. Transcription Regulating Domains 
A transcription regulating domain refers to a peptide that acts to
activate or repress transcription of a target nucleotide (e.g., gene). Transcriptional
activation domains are well known in the art (See, e.g., Seipel et al., (1992)
EMBO J. 11:4961-4968). Exemplary and preferred transcription activation
domains include VP16, TA2, VP64, STAT6, relA, TAF-1, TAF-2, TAU-1 and
TAU-2. Especially preferred activation domains for use in the present invention
are VP16 and VP64. Means for linking VP16 and VP64 to ligand binding
domains are set forth hereinafter in the Examples.

Transcriptional repressor domains are also well known in the art.
Exemplary and preferred such transcriptional repressors are ERD, KRAB, SID,
histone deacetylase, DNA methylase, and derivatives, multimers and
combinations thereof such as KRAB-ERD, SID-ERD, (KRAB)2, (KRAB)3,
KRAB-A, (KRAB-A)2, (SID)2, (KRAB-A)-SID and SID-(KRAB-A). A first
repressor domain can be prepared using the Krüppel-associated box (KRAB)
domain (Margolin et al., 1994). This repressor domain is commonly found at the
N-terminus of zinc finger proteins and presumably exerts its repressive activity on
TATA-dependent transcription in a distance- and orientation-independent manner,
by interacting with the RING finger protein KAP-1. One can utilize the KRAB
domain found between amino acids 1 and 97 of the zinc finger protein KOX1.
Finally, to explore the utility of histone deacetylation for repression, amino acids
1 to 36 of the Mad mSIN3 interaction domain (SID) can be fused to another
domain (Ayer et al., 1996). This small domain is found at the N-terminus of the
transcription factor Mad and is responsible for mediating its transcriptional
repression by interacting with mSIN3, which in turn interacts with the co-repressor
N-CoR and with the histone deacetylase mRPD1.

The amino acid residue and nucleotide encoding sequences of preferred
transcriptional repression domains (KRAB-A)2 and (SID)2 are shown in FIGs. 7
and 8, respectively. Means for linking repression domains to ligand binding
domains as well as exemplary polypeptide gene switches containing repression
domains are set forth hereinafter in the Examples.

3. Polypeptide Gene Switches
A polypeptide of this invention, in one embodiment, comprises two
ligand binding domains and a first functional domain. In another embodiment, a
polypeptide gene switch comprises two ligand binding domains, a first functional
domain and a second functional domain. These domains can exist in any order as
shown below.

In a preferred embodiment the two ligand binding domains (LBDs) are
located directly adjacent to one another, i.e., they are "serially connected" within
the monomeric polypeptide gene switch of the invention and are not separated by
a functional domain of the invention. The serially connected LBDs may be
separated from one another by a linker molecule, such as for example a
polypeptide linker molecule.

In a preferred embodiment the two LBDs are located between two
functional domains (FDs) of the invention, wherein one functional domain is a
Transcription Regulating Domain (TRD) and the other functional domain is a
Nucleotide Binding Domain (NBD).

In one particularly preferred embodiment the monomeric polypeptide gene
switch of the invention consists of two FDs and two LBDs in the sequential order
FD-1 / LBD-1 / LBD-2 / FD-2. Preferably, in this embodiment, one
functional domain is a TRD and the other functional domain is an NBD.

Preferably, the NBD employed in the monomeric polypeptide gene
switch of the invention includes 6 zinc finger binding motifs. As further described
in the examples hereinbelow, a 6 zinc finger NBD employed in a monomeric
polypeptide gene switch allows for the recognition of a unique 18 bp nucleic acid
sequence, which may be symmetric or asymmetric.
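
The link between six zinc fingers and a unique 18 bp site can be illustrated with a short calculation. The following is only a rough sketch: it assumes three base pairs per finger, treats the genome as random sequence, and uses the 4 billion base pair figure given in the Background; the function names are hypothetical and chosen for illustration.

```python
# Rough estimate of how often a zinc finger target site of a given length
# would be expected to occur by chance in a genome of random sequence.
BP_PER_FINGER = 3                 # each zinc finger domain recognizes 3 bp
GENOME_SIZE_BP = 4_000_000_000    # genome size quoted in the Background


def site_length(num_fingers: int) -> int:
    """Length in bp of the site recognized by a polydactyl protein."""
    return num_fingers * BP_PER_FINGER


def distinct_sites(num_fingers: int) -> int:
    """Number of distinct sequences of that length (4 bases per position)."""
    return 4 ** site_length(num_fingers)


def expected_random_hits(num_fingers: int, genome_bp: int = GENOME_SIZE_BP) -> float:
    """Expected chance occurrences, counting both strands of the genome."""
    return 2 * genome_bp / distinct_sites(num_fingers)


if __name__ == "__main__":
    for fingers in (3, 6):
        print(fingers, site_length(fingers), distinct_sites(fingers),
              expected_random_hits(fingers))
```

Under these assumptions a three-finger (9 bp) site is expected many thousands of times by chance, whereas a six-finger (18 bp) site is expected less than once.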



LBDs / FD-1 / FD-2

FD-1 / FD-2 / LBDs

FD-1 / LBDs / FD-2

FD-2 / LBDs / FD-1

FD-2 / FD-1 / LBDs



A wide variety of polypeptide gene switches have been made. Exemplary 
such gene switches include (see above for definition of terms): 

Gene Switches Using RXR, E2C, and Activation Domains
E2C-RXR-L-DE-VP64, E2C-RXR-LL-DE-VP64, E2C-RXR-LLL-DE-VP64,
E2C-RXR-L-BE-VP64, E2C-RXR-LL-BE-VP64, E2C-RXR-LLL-BE-VP64,
E2C-RXR-L-DE-VP16, E2C-RXR-LL-DE-VP16, E2C-RXR-LLL-DE-VP16,
E2C-RXR-L-BE-VP16, E2C-RXR-LL-BE-VP16, E2C-RXR-LLL-BE-VP16;

Gene Switches Using RXR, 2C7, and Activation Domains
2C7-RXR-L-DE-VP64, 2C7-RXR-LL-DE-VP64, 2C7-RXR-LLL-DE-VP64,
2C7-RXR-L-BE-VP64, 2C7-RXR-LL-BE-VP64, 2C7-RXR-LLL-BE-VP64,
2C7-RXR-L-DE-VP16, 2C7-RXR-LL-DE-VP16, 2C7-RXR-LLL-DE-VP16,
2C7-RXR-L-BE-VP16, 2C7-RXR-LL-BE-VP16, 2C7-RXR-LLL-BE-VP16;

Gene Switches Using RXR, B3B, and Activation Domains
B3B-RXR-L-DE-VP64, B3B-RXR-LL-DE-VP64, B3B-RXR-LLL-DE-VP64,
B3B-RXR-L-BE-VP64, B3B-RXR-LL-BE-VP64, B3B-RXR-LLL-BE-VP64,
B3B-RXR-L-DE-VP16, B3B-RXR-LL-DE-VP16, B3B-RXR-LLL-DE-VP16,
B3B-RXR-L-BE-VP16, B3B-RXR-LL-BE-VP16, B3B-RXR-LLL-BE-VP16;

Gene Switches Using RXR, B3C2, and Activation Domains
B3C2-RXR-L-DE-VP64, B3C2-RXR-LL-DE-VP64, B3C2-RXR-LLL-DE-VP64,
B3C2-RXR-L-BE-VP64, B3C2-RXR-LL-BE-VP64, B3C2-RXR-LLL-BE-VP64,
B3C2-RXR-L-DE-VP16, B3C2-RXR-LL-DE-VP16, B3C2-RXR-LLL-DE-VP16,
B3C2-RXR-L-BE-VP16, B3C2-RXR-LL-BE-VP16, B3C2-RXR-LLL-BE-VP16;

Gene Switches Using RXR, E2C, and Repression Domains
E2C-RXR-L-DE-(KRAB-A)2, E2C-RXR-LL-DE-(KRAB-A)2, E2C-RXR-LLL-DE-(KRAB-A)2,
E2C-RXR-L-BE-(KRAB-A)2, E2C-RXR-LL-BE-(KRAB-A)2, E2C-RXR-LLL-BE-(KRAB-A)2,
E2C-RXR-L-DE-(KRAB-A)2, E2C-RXR-LL-DE-(KRAB-A)2, E2C-RXR-LLL-DE-(KRAB-A)2,
E2C-RXR-L-BE-(KRAB-A)2, E2C-RXR-LL-BE-(KRAB-A)2, E2C-RXR-LLL-BE-(KRAB-A)2,
E2C-RXR-L-DE-(SID)2, E2C-RXR-LL-DE-(SID)2, E2C-RXR-LLL-DE-(SID)2,
E2C-RXR-L-BE-(SID)2, E2C-RXR-LL-BE-(SID)2, E2C-RXR-LLL-BE-(SID)2,
E2C-RXR-L-DE-(SID)2, E2C-RXR-LL-DE-(SID)2, E2C-RXR-LLL-DE-(SID)2,
E2C-RXR-L-BE-(SID)2, E2C-RXR-LL-BE-(SID)2, E2C-RXR-LLL-BE-(SID)2;

Gene Switches Using RXR, 2C7, and Repression Domains
2C7-RXR-L-DE-(KRAB-A)2, 2C7-RXR-LL-DE-(KRAB-A)2, 2C7-RXR-LLL-DE-(KRAB-A)2,
2C7-RXR-L-BE-(KRAB-A)2, 2C7-RXR-LL-BE-(KRAB-A)2, 2C7-RXR-LLL-BE-(KRAB-A)2,
2C7-RXR-L-DE-(KRAB-A)2, 2C7-RXR-LL-DE-(KRAB-A)2, 2C7-RXR-LLL-DE-(KRAB-A)2,
2C7-RXR-L-BE-(KRAB-A)2, 2C7-RXR-LL-BE-(KRAB-A)2, 2C7-RXR-LLL-BE-(KRAB-A)2,
2C7-RXR-L-DE-(SID)2, 2C7-RXR-LL-DE-(SID)2, 2C7-RXR-LLL-DE-(SID)2,
2C7-RXR-L-BE-(SID)2, 2C7-RXR-LL-BE-(SID)2, 2C7-RXR-LLL-BE-(SID)2,
2C7-RXR-L-DE-(SID)2, 2C7-RXR-LL-DE-(SID)2, 2C7-RXR-LLL-DE-(SID)2,
2C7-RXR-L-BE-(SID)2, 2C7-RXR-LL-BE-(SID)2, 2C7-RXR-LLL-BE-(SID)2;

Gene Switches Using RXR, B3B, and Repression Domains
B3B-RXR-L-DE-(KRAB-A)2, B3B-RXR-LL-DE-(KRAB-A)2, B3B-RXR-LLL-DE-(KRAB-A)2,
B3B-RXR-L-BE-(KRAB-A)2, B3B-RXR-LL-BE-(KRAB-A)2, B3B-RXR-LLL-BE-(KRAB-A)2,
B3B-RXR-L-DE-(KRAB-A)2, B3B-RXR-LL-DE-(KRAB-A)2, B3B-RXR-LLL-DE-(KRAB-A)2,
B3B-RXR-L-BE-(KRAB-A)2, B3B-RXR-LL-BE-(KRAB-A)2, B3B-RXR-LLL-BE-(KRAB-A)2,
B3B-RXR-L-DE-(SID)2, B3B-RXR-LL-DE-(SID)2, B3B-RXR-LLL-DE-(SID)2,
B3B-RXR-L-BE-(SID)2, B3B-RXR-LL-BE-(SID)2, B3B-RXR-LLL-BE-(SID)2,
B3B-RXR-L-DE-(SID)2, B3B-RXR-LL-DE-(SID)2, B3B-RXR-LLL-DE-(SID)2,
B3B-RXR-L-BE-(SID)2, B3B-RXR-LL-BE-(SID)2, B3B-RXR-LLL-BE-(SID)2;

Gene Switches Using RXR, B3C2, and Repression Domains
B3C2-RXR-L-DE-(KRAB-A)2, B3C2-RXR-LL-DE-(KRAB-A)2, B3C2-RXR-LLL-DE-(KRAB-A)2,
B3C2-RXR-L-BE-(KRAB-A)2, B3C2-RXR-LL-BE-(KRAB-A)2, B3C2-RXR-LLL-BE-(KRAB-A)2,
B3C2-RXR-L-DE-(KRAB-A)2, B3C2-RXR-LL-DE-(KRAB-A)2, B3C2-RXR-LLL-DE-(KRAB-A)2,
B3C2-RXR-L-BE-(KRAB-A)2, B3C2-RXR-LL-BE-(KRAB-A)2, B3C2-RXR-LLL-BE-(KRAB-A)2,
B3C2-RXR-L-DE-(SID)2, B3C2-RXR-LL-DE-(SID)2, B3C2-RXR-LLL-DE-(SID)2,
B3C2-RXR-L-BE-(SID)2, B3C2-RXR-LL-BE-(SID)2, B3C2-RXR-LLL-BE-(SID)2,
B3C2-RXR-L-DE-(SID)2, B3C2-RXR-LL-DE-(SID)2, B3C2-RXR-LLL-DE-(SID)2,
B3C2-RXR-L-BE-(SID)2, B3C2-RXR-LL-BE-(SID)2, B3C2-RXR-LLL-BE-(SID)2;

Gene Switches Using PR, E2C, and Activation Domains
E2C-PR-L-PR-VP64, E2C-PR-LL-PR-VP64, E2C-PR-LLL-PR-VP64,
E2C-PR-L-PR-VP64, E2C-PR-LL-PR-VP64, E2C-PR-LLL-PR-VP64,
E2C-PR-L-PR-VP16, E2C-PR-LL-PR-VP16, E2C-PR-LLL-PR-VP16,
E2C-PR-L-PR-VP16, E2C-PR-LL-PR-VP16, E2C-PR-LLL-PR-VP16;

Gene Switches Using PR, 2C7, and Activation Domains
2C7-PR-L-PR-VP64, 2C7-PR-LL-PR-VP64, 2C7-PR-LLL-PR-VP64,
2C7-PR-L-PR-VP64, 2C7-PR-LL-PR-VP64, 2C7-PR-LLL-PR-VP64,
2C7-PR-L-PR-VP16, 2C7-PR-LL-PR-VP16, 2C7-PR-LLL-PR-VP16,
2C7-PR-L-PR-VP16, 2C7-PR-LL-PR-VP16, 2C7-PR-LLL-PR-VP16;

Gene Switches Using PR, B3B, and Activation Domains
B3B-PR-L-PR-VP64, B3B-PR-LL-PR-VP64, B3B-PR-LLL-PR-VP64,
B3B-PR-L-PR-VP64, B3B-PR-LL-PR-VP64, B3B-PR-LLL-PR-VP64,
B3B-PR-L-PR-VP16, B3B-PR-LL-PR-VP16, B3B-PR-LLL-PR-VP16,
B3B-PR-L-PR-VP16, B3B-PR-LL-PR-VP16, B3B-PR-LLL-PR-VP16;

Gene Switches Using PR, B3C2, and Activation Domains
B3C2-PR-L-PR-VP64, B3C2-PR-LL-PR-VP64, B3C2-PR-LLL-PR-VP64,
B3C2-PR-L-PR-VP64, B3C2-PR-LL-PR-VP64, B3C2-PR-LLL-PR-VP64,
B3C2-PR-L-PR-VP16, B3C2-PR-LL-PR-VP16, B3C2-PR-LLL-PR-VP16,
B3C2-PR-L-PR-VP16, B3C2-PR-LL-PR-VP16, B3C2-PR-LLL-PR-VP16;

Gene Switches Using PR, E2C, and Repression Domains
E2C-PR-L-PR-(KRAB-A)2, E2C-PR-LL-PR-(KRAB-A)2, E2C-PR-LLL-PR-(KRAB-A)2,
E2C-PR-L-PR-(KRAB-A)2, E2C-PR-LL-PR-(KRAB-A)2, E2C-PR-LLL-PR-(KRAB-A)2,
E2C-PR-L-PR-(KRAB-A)2, E2C-PR-LL-PR-(KRAB-A)2, E2C-PR-LLL-PR-(KRAB-A)2,
E2C-PR-L-PR-(KRAB-A)2, E2C-PR-LL-PR-(KRAB-A)2, E2C-PR-LLL-PR-(KRAB-A)2,
E2C-PR-L-PR-(SID)2, E2C-PR-LL-PR-(SID)2, E2C-PR-LLL-PR-(SID)2,
E2C-PR-L-PR-(SID)2, E2C-PR-LL-PR-(SID)2, E2C-PR-LLL-PR-(SID)2,
E2C-PR-L-PR-(SID)2, E2C-PR-LL-PR-(SID)2, E2C-PR-LLL-PR-(SID)2,
E2C-PR-L-PR-(SID)2, E2C-PR-LL-PR-(SID)2, E2C-PR-LLL-PR-(SID)2;

Gene Switches Using PR, 2C7, and Repression Domains
2C7-PR-L-PR-(KRAB-A)2, 2C7-PR-LL-PR-(KRAB-A)2, 2C7-PR-LLL-PR-(KRAB-A)2,
2C7-PR-L-PR-(KRAB-A)2, 2C7-PR-LL-PR-(KRAB-A)2, 2C7-PR-LLL-PR-(KRAB-A)2,
2C7-PR-L-PR-(KRAB-A)2, 2C7-PR-LL-PR-(KRAB-A)2, 2C7-PR-LLL-PR-(KRAB-A)2,
2C7-PR-L-PR-(KRAB-A)2, 2C7-PR-LL-PR-(KRAB-A)2, 2C7-PR-LLL-PR-(KRAB-A)2,
2C7-PR-L-PR-(SID)2, 2C7-PR-LL-PR-(SID)2, 2C7-PR-LLL-PR-(SID)2,
2C7-PR-L-PR-(SID)2, 2C7-PR-LL-PR-(SID)2, 2C7-PR-LLL-PR-(SID)2,
2C7-PR-L-PR-(SID)2, 2C7-PR-LL-PR-(SID)2, 2C7-PR-LLL-PR-(SID)2,
2C7-PR-L-PR-(SID)2, 2C7-PR-LL-PR-(SID)2, 2C7-PR-LLL-PR-(SID)2;

Gene Switches Using PR, B3B, and Repression Domains
B3B-PR-L-PR-(KRAB-A)2, B3B-PR-LL-PR-(KRAB-A)2, B3B-PR-LLL-PR-(KRAB-A)2,
B3B-PR-L-PR-(KRAB-A)2, B3B-PR-LL-PR-(KRAB-A)2, B3B-PR-LLL-PR-(KRAB-A)2,
B3B-PR-L-PR-(KRAB-A)2, B3B-PR-LL-PR-(KRAB-A)2, B3B-PR-LLL-PR-(KRAB-A)2,
B3B-PR-L-PR-(KRAB-A)2, B3B-PR-LL-PR-(KRAB-A)2, B3B-PR-LLL-PR-(KRAB-A)2,
B3B-PR-L-PR-(SID)2, B3B-PR-LL-PR-(SID)2, B3B-PR-LLL-PR-(SID)2,
B3B-PR-L-PR-(SID)2, B3B-PR-LL-PR-(SID)2, B3B-PR-LLL-PR-(SID)2,
B3B-PR-L-PR-(SID)2, B3B-PR-LL-PR-(SID)2, B3B-PR-LLL-PR-(SID)2,
B3B-PR-L-PR-(SID)2, B3B-PR-LL-PR-(SID)2, B3B-PR-LLL-PR-(SID)2;

Gene Switches Using PR, B3C2, and Repression Domains
B3C2-PR-L-PR-(KRAB-A)2, B3C2-PR-LL-PR-(KRAB-A)2, B3C2-PR-LLL-PR-(KRAB-A)2,
B3C2-PR-L-PR-(KRAB-A)2, B3C2-PR-LL-PR-(KRAB-A)2, B3C2-PR-LLL-PR-(KRAB-A)2,
B3C2-PR-L-PR-(KRAB-A)2, B3C2-PR-LL-PR-(KRAB-A)2, B3C2-PR-LLL-PR-(KRAB-A)2,
B3C2-PR-L-PR-(KRAB-A)2, B3C2-PR-LL-PR-(KRAB-A)2, B3C2-PR-LLL-PR-(KRAB-A)2,
B3C2-PR-L-PR-(SID)2, B3C2-PR-LL-PR-(SID)2, B3C2-PR-LLL-PR-(SID)2,
B3C2-PR-L-PR-(SID)2, B3C2-PR-LL-PR-(SID)2, B3C2-PR-LLL-PR-(SID)2,
B3C2-PR-L-PR-(SID)2, B3C2-PR-LL-PR-(SID)2, B3C2-PR-LLL-PR-(SID)2,
B3C2-PR-L-PR-(SID)2, B3C2-PR-LL-PR-(SID)2, B3C2-PR-LLL-PR-(SID)2;

Gene Switches Using ER, E2C, and Activation Domains
E2C-ER-L-ER-VP64, E2C-ER-LL-ER-VP64, E2C-ER-LLL-ER-VP64,
E2C-ER-L-ER-VP64, E2C-ER-LL-ER-VP64, E2C-ER-LLL-ER-VP64,
E2C-ER-L-ER-VP16, E2C-ER-LL-ER-VP16, E2C-ER-LLL-ER-VP16,
E2C-ER-L-ER-VP16, E2C-ER-LL-ER-VP16, E2C-ER-LLL-ER-VP16;

Gene Switches Using ER, 2C7, and Activation Domains
2C7-ER-L-ER-VP64, 2C7-ER-LL-ER-VP64, 2C7-ER-LLL-ER-VP64,
2C7-ER-L-ER-VP64, 2C7-ER-LL-ER-VP64, 2C7-ER-LLL-ER-VP64,
2C7-ER-L-ER-VP16, 2C7-ER-LL-ER-VP16, 2C7-ER-LLL-ER-VP16,
2C7-ER-L-ER-VP16, 2C7-ER-LL-ER-VP16, 2C7-ER-LLL-ER-VP16;

Gene Switches Using ER, B3B, and Activation Domains
B3B-ER-L-ER-VP64, B3B-ER-LL-ER-VP64, B3B-ER-LLL-ER-VP64,
B3B-ER-L-ER-VP64, B3B-ER-LL-ER-VP64, B3B-ER-LLL-ER-VP64,
B3B-ER-L-ER-VP16, B3B-ER-LL-ER-VP16, B3B-ER-LLL-ER-VP16,
B3B-ER-L-ER-VP16, B3B-ER-LL-ER-VP16, B3B-ER-LLL-ER-VP16;

Gene Switches Using ER, B3C2, and Activation Domains
B3C2-ER-L-ER-VP64, B3C2-ER-LL-ER-VP64, B3C2-ER-LLL-ER-VP64,
B3C2-ER-L-ER-VP64, B3C2-ER-LL-ER-VP64, B3C2-ER-LLL-ER-VP64,
B3C2-ER-L-ER-VP16, B3C2-ER-LL-ER-VP16, B3C2-ER-LLL-ER-VP16,
B3C2-ER-L-ER-VP16, B3C2-ER-LL-ER-VP16, B3C2-ER-LLL-ER-VP16;

Gene Switches Using ER, E2C, and Repression Domains
E2C-ER-L-ER-(KRAB-A)2, E2C-ER-LL-ER-(KRAB-A)2, E2C-ER-LLL-ER-(KRAB-A)2,
E2C-ER-L-ER-(KRAB-A)2, E2C-ER-LL-ER-(KRAB-A)2, E2C-ER-LLL-ER-(KRAB-A)2,
E2C-ER-L-ER-(KRAB-A)2, E2C-ER-LL-ER-(KRAB-A)2, E2C-ER-LLL-ER-(KRAB-A)2,
E2C-ER-L-ER-(KRAB-A)2, E2C-ER-LL-ER-(KRAB-A)2, E2C-ER-LLL-ER-(KRAB-A)2,
E2C-ER-L-ER-(SID)2, E2C-ER-LL-ER-(SID)2, E2C-ER-LLL-ER-(SID)2,
E2C-ER-L-ER-(SID)2, E2C-ER-LL-ER-(SID)2, E2C-ER-LLL-ER-(SID)2,
E2C-ER-L-ER-(SID)2, E2C-ER-LL-ER-(SID)2, E2C-ER-LLL-ER-(SID)2,
E2C-ER-L-ER-(SID)2, E2C-ER-LL-ER-(SID)2, E2C-ER-LLL-ER-(SID)2;

Gene Switches Using ER, 2C7, and Repression Domains
2C7-ER-L-ER-(KRAB-A)2, 2C7-ER-LL-ER-(KRAB-A)2, 2C7-ER-LLL-ER-(KRAB-A)2,
2C7-ER-L-ER-(KRAB-A)2, 2C7-ER-LL-ER-(KRAB-A)2, 2C7-ER-LLL-ER-(KRAB-A)2,
2C7-ER-L-ER-(KRAB-A)2, 2C7-ER-LL-ER-(KRAB-A)2, 2C7-ER-LLL-ER-(KRAB-A)2,
2C7-ER-L-ER-(KRAB-A)2, 2C7-ER-LL-ER-(KRAB-A)2, 2C7-ER-LLL-ER-(KRAB-A)2,
2C7-ER-L-ER-(SID)2, 2C7-ER-LL-ER-(SID)2, 2C7-ER-LLL-ER-(SID)2,
2C7-ER-L-ER-(SID)2, 2C7-ER-LL-ER-(SID)2, 2C7-ER-LLL-ER-(SID)2,
2C7-ER-L-ER-(SID)2, 2C7-ER-LL-ER-(SID)2, 2C7-ER-LLL-ER-(SID)2,
2C7-ER-L-ER-(SID)2, 2C7-ER-LL-ER-(SID)2, 2C7-ER-LLL-ER-(SID)2;

Gene Switches Using ER, B3B, and Repression Domains
B3B-ER-L-ER-(KRAB-A)2, B3B-ER-LL-ER-(KRAB-A)2, B3B-ER-LLL-ER-(KRAB-A)2,
B3B-ER-L-ER-(KRAB-A)2, B3B-ER-LL-ER-(KRAB-A)2, B3B-ER-LLL-ER-(KRAB-A)2,
B3B-ER-L-ER-(KRAB-A)2, B3B-ER-LL-ER-(KRAB-A)2, B3B-ER-LLL-ER-(KRAB-A)2,
B3B-ER-L-ER-(KRAB-A)2, B3B-ER-LL-ER-(KRAB-A)2, B3B-ER-LLL-ER-(KRAB-A)2,
B3B-ER-L-ER-(SID)2, B3B-ER-LL-ER-(SID)2, B3B-ER-LLL-ER-(SID)2,
B3B-ER-L-ER-(SID)2, B3B-ER-LL-ER-(SID)2, B3B-ER-LLL-ER-(SID)2,
B3B-ER-L-ER-(SID)2, B3B-ER-LL-ER-(SID)2, B3B-ER-LLL-ER-(SID)2,
B3B-ER-L-ER-(SID)2, B3B-ER-LL-ER-(SID)2, B3B-ER-LLL-ER-(SID)2;

Gene Switches Using ER, B3C2, and Repression Domains
B3C2-ER-L-ER-(KRAB-A)2, B3C2-ER-LL-ER-(KRAB-A)2, B3C2-ER-LLL-ER-(KRAB-A)2,
B3C2-ER-L-ER-(KRAB-A)2, B3C2-ER-LL-ER-(KRAB-A)2, B3C2-ER-LLL-ER-(KRAB-A)2,
B3C2-ER-L-ER-(KRAB-A)2, B3C2-ER-LL-ER-(KRAB-A)2, B3C2-ER-LLL-ER-(KRAB-A)2,
B3C2-ER-L-ER-(KRAB-A)2, B3C2-ER-LL-ER-(KRAB-A)2, B3C2-ER-LLL-ER-(KRAB-A)2,
B3C2-ER-L-ER-(SID)2, B3C2-ER-LL-ER-(SID)2, B3C2-ER-LLL-ER-(SID)2,
B3C2-ER-L-ER-(SID)2, B3C2-ER-LL-ER-(SID)2, B3C2-ER-LLL-ER-(SID)2,
B3C2-ER-L-ER-(SID)2, B3C2-ER-LL-ER-(SID)2, B3C2-ER-LLL-ER-(SID)2,
B3C2-ER-L-ER-(SID)2, B3C2-ER-LL-ER-(SID)2, B3C2-ER-LLL-ER-(SID)2.
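
The names listed above follow a regular pattern: DNA binding domain, first ligand binding domain, linker (L, LL or LLL), second ligand binding domain token, and effector domain. As a minimal sketch of that pattern (the function and variable names are hypothetical and not part of the specification), the distinct names in each group can be enumerated as follows.

```python
from itertools import product

LINKERS = ["L", "LL", "LLL"]
ACTIVATION_DOMAINS = ["VP64", "VP16"]
REPRESSION_DOMAINS = ["(KRAB-A)2", "(SID)2"]
# First LBD mapped to the second-LBD tokens that appear in the lists above.
SECOND_LBD_TOKENS = {"RXR": ["DE", "BE"], "PR": ["PR"], "ER": ["ER"]}


def switch_names(dbd, first_lbd, effectors):
    """Return the distinct names of the form DBD-LBD1-linker-LBD2-effector."""
    names = []
    # Effector varies slowest and linker fastest, mirroring the listed order.
    for effector, second_lbd, linker in product(
        effectors, SECOND_LBD_TOKENS[first_lbd], LINKERS
    ):
        names.append(f"{dbd}-{first_lbd}-{linker}-{second_lbd}-{effector}")
    return names


if __name__ == "__main__":
    for name in switch_names("E2C", "RXR", ACTIVATION_DOMAINS):
        print(name)
```

Each name is produced once, so the output is the set of distinct constructs in a group rather than a line-by-line copy of the lists.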

The nucleotide (SEQ ID NO: 39) and amino acid residue sequence (SEQ
ID NO: 40) of polypeptide E2C-ER-L-ER-VP64 are shown in FIG. 9. The
nucleotide (SEQ ID NO: 41) and amino acid residue sequence (SEQ ID NO: 42)
of polypeptide E2C-ER-LL-ER-VP64 are shown in FIG. 10.

III. Polynucleotides, Expression Vectors and Host Cells

In a related aspect, the present invention provides polynucleotides that
encode a polypeptide gene switch of this invention, expression vectors containing
those polynucleotides, cells containing those polynucleotides and transformed
cells containing those expression vectors. Vectors of primary utility for gene
therapy include, but are not limited to, human adenovirus vectors, adeno-associated
vectors, murine or lentivirus derived retroviral vectors, or a variety of
non-viral compositions including liposomes, polymers, and other DNA containing
conjugates. Such vector systems can be used to deliver the gene switches either in
vitro or in vivo, depending on the vector system. With adenovirus, for instance,
vectors can be administered intravenously to transduce the liver and other organs,
introduced directly into the lung, or into vascular compartments temporarily
localized by ligation or other methods. Methods for constructing such vectors,
and methods and uses for the described invention are known to those skilled in the
field of gene therapy.

IV. Methods of Regulating Nucleotide Function

The present invention further provides a process for regulating the
expression of a desired nucleotide sequence such as a gene. In accordance with
the process, the target nucleotide sequence is exposed to an effective amount of a
gene switch and a ligand, wherein the nucleotide binding domain of the gene
switch binds to a portion of the target nucleotide and wherein the ligand binds to
at least one of the ligand binding domains of the gene switch. Exposure can occur
in vitro, in situ or in vivo. The term "effective amount" means that amount that
regulates transcription of a nucleotide (e.g., a structural gene) or translation of RNA.
The term "regulating" refers to the suppression, enhancement, or induction
of a function. For example, a polypeptide of the invention may modulate a
promoter sequence by binding to a motif within the promoter, thereby enhancing
or suppressing transcription of a gene operatively linked to the promoter
nucleotide sequence. Alternatively, modulation may include inhibition of a gene
where the polypeptide binds to the structural gene and blocks DNA dependent
RNA polymerase from reading through the gene, thus inhibiting transcription of
the gene. Alternatively, modulation may include inhibition of translation of a
transcript.

The promoter region of a gene includes the regulatory elements that
typically lie 5' to a structural gene. If a gene is to be activated, proteins known as
transcription factors attach to the promoter region of the gene. This assembly
resembles an "on switch" by enabling an enzyme to transcribe a second genetic
segment from DNA to RNA. In most cases the resulting RNA molecule serves as
a template for synthesis of a specific protein; sometimes RNA itself is the final
product.

Regulation of gene expression or transcription can be accomplished either
by exposing the target gene to a polypeptide switch of this invention or,
preferably, by transforming a cell that contains the target gene with an expression
vector that contains a polynucleotide sequence that encodes a gene switch.

The Examples that follow illustrate particular embodiments of the present
invention and are not limiting of the specification or claims in any way.


EXAMPLE 1: General Methods 

Construction of zinc finger proteins. For the construction of the B3 and
N1 zinc finger proteins, DNA recognition helices from the Zif268 Finger 2
variants pmGAA, pmGAC, pmGGA, pmGGG, and pGTA were utilized [Segal,
D. J., Dreier, B., Beerli, R. R., and Barbas, C. F., III (1999) Proc. Natl. Acad. Sci.
USA 96, 2758-2763]. Three-finger proteins binding the respective 9-bp target sites
were constructed by grafting the appropriate DNA recognition helices into the
framework of the three-finger protein Sp1C [Desjarlais, J. R., and Berg, J. M.
(1993) Proc. Natl. Acad. Sci. USA 90, 2256-2260]. DNA fragments encoding the
two three-finger proteins were assembled from 6 overlapping oligonucleotides as
described [Beerli, R. R., Segal, D. J., Dreier, B., and Barbas, C. F., III (1998) Proc.
Natl. Acad. Sci. USA 95, 14628-14633]. The three-finger protein coding regions
were then cloned into the bacterial expression vector pMal-CSS via SfiI
digestion.

Protein purification. Maltose binding protein (MBP) fusion proteins
were purified to >90% homogeneity using the Protein Fusion and Purification
System (New England Biolabs), except that Zinc Buffer A (ZBA; 10 mM Tris,
pH 7.5/90 mM KCl, 1 mM MgCl2, 90 µM ZnCl2)/1% BSA/5 mM DTT was used
as the column buffer. Protein purity and concentration were determined from
Coomassie blue-stained 15% SDS-PAGE gels by comparison to BSA standards.

ELISA analysis. In 96-well ELISA plates, 0.2 µg of streptavidin (Pierce)
was applied to each well for 1 hour at 37°C, then washed twice with water.
Biotinylated target oligonucleotide (0.025 µg) was applied in the same manner.
ZBA/3% BSA was applied for blocking, but the wells were not washed after
incubation. All subsequent incubations were at room temperature. Starting with
2 µg purified MBP fusion protein in the top wells, 2-fold serial dilutions were
applied in 1x binding buffer (ZBA/1% BSA/5 mM DTT/0.12 µg/µl sheared
herring sperm DNA). The samples were incubated 1 hour at room temperature,
followed by 10 washes with water. Mouse anti-maltose binding protein mAb
(Sigma) in ZBA/1% BSA was applied to the wells for 30 minutes, followed by 10
washes with water. Goat anti-mouse IgG mAb conjugated to alkaline
phosphatase (Sigma) was applied to the wells for 30 minutes, followed by 10
washes with water. Alkaline phosphatase substrate (Sigma) was applied, and the
OD405 was quantitated with SOFTmax 2.35 (Molecular Devices).
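
The protein titration used in the ELISA can be tabulated with a few lines of code. This is a sketch only: the 2 µg starting amount and the 2-fold steps come from the protocol above, while the choice of eight wells (one plate column) and the function name are assumptions made for illustration.

```python
def serial_dilution(start_ug: float, fold: float, wells: int) -> list:
    """Protein amount (in micrograms) per well, top well first."""
    if fold <= 1 or wells < 1:
        raise ValueError("fold must exceed 1 and wells must be at least 1")
    return [start_ug / fold ** i for i in range(wells)]


# 2 ug in the top well and 2-fold steps are taken from the protocol;
# eight wells (rows A to H of one column) is an illustrative assumption.
amounts = serial_dilution(2.0, 2, 8)
for row, amount in zip("ABCDEFGH", amounts):
    print(f"row {row}: {amount:.4f} ug")
```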

Gel mobility shift assays. Target oligonucleotides were labeled at their 3'
ends with [32P] and gel purified. Eleven 3-fold serial dilutions of protein were
incubated in 20 µl binding reactions (1x Binding Buffer/10% glycerol/~1 pM
target oligonucleotide) for three hours at room temperature, then resolved on a 5%
polyacrylamide gel in 0.5x TBE buffer. Quantitation of dried gels was performed
using a PhosphorImager and ImageQuant software (Molecular Dynamics), and
the KD was determined by Scatchard analysis.
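
One common way to set up a Scatchard analysis of such a titration is sketched below. It assumes that the labeled DNA is present in trace amounts, so that free protein can be approximated by total protein; in that case the bound fraction theta satisfies theta/[P] = (1 - theta)/KD, and a plot of theta/[P] against theta has slope -1/KD. The data are simulated for an assumed KD and are not results from this work; all names are hypothetical.

```python
def scatchard_kd(protein_nM, fraction_bound):
    """Estimate KD (nM) from the slope of a Scatchard plot.

    Plots theta / [P] against theta and fits a straight line by ordinary
    least squares; the slope of that line is -1 / KD.
    """
    xs = list(fraction_bound)
    ys = [theta / conc for theta, conc in zip(xs, protein_nM)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxx / sxy  # equals -1 / slope


# Hypothetical titration: eleven 3-fold dilutions starting at 1000 nM protein,
# with binding simulated for an assumed KD of 5 nM.
TRUE_KD_NM = 5.0
concentrations = [1000.0 / 3 ** i for i in range(11)]
fractions = [c / (TRUE_KD_NM + c) for c in concentrations]
estimated_kd = scatchard_kd(concentrations, fractions)
print(f"estimated KD = {estimated_kd:.3f} nM")
```

With real gel data the bound fractions would come from the quantitated band intensities, and points near zero or full saturation are usually the least reliable.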

Reporter constructs for determining the optimal spacing and
orientation of the two half-sites. C7 dimer-TATA fragments were generated by
PCR amplification with C7 dimer-TATA primers (5'-GAG GGT ACC GCGTGG
GCG A0-5 GCGTGG GCG AGT CGA CTC TAG AGG GTA TAT AAT GG-3'
(SEQ ID NO: 1) for direct repeats; 5'-GAG GGT ACC GCGTGG GCG A0-5
CGC CCA CGC AGT CGA CTC TAG AGG GTA TAT AAT GG-3' (SEQ ID
NO: 2) for inverted repeats; 5'-GAG GGT ACC CGCCCA CGC A0-5 GCGTGG
GCG AGT CGA CTC TAG AGG GTA TAT AAT GG-3' (SEQ ID NO: 3) for
everted repeats) and GLprimer2 (5'-CTT TAT GTT TTT GGC GTC TTC C-3'
(SEQ ID NO: 4); Promega), using p17x4TATA-luc (gift from S. Y. Tsai) as a
template. PCR products were cloned into pGL3-Basic (Promega) via digestion
with the restriction endonucleases KpnI and NcoI.
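
The three half-site arrangements encoded by these primers differ only in which copy of the C7 site is reverse-complemented. The sketch below reproduces the dimer sites from the 9 bp half-site, reading "A0-5" as a spacer of zero to five adenines (an interpretation of the primer notation); the function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
HALF_SITE = "GCGTGGGCG"  # C7 half-site as written in the primers above


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def dimer_site(orientation: str, spacer_len: int) -> str:
    """Build a two half-site target with an oligo-A spacer of 0 to 5 bases."""
    if not 0 <= spacer_len <= 5:
        raise ValueError("spacer length must be between 0 and 5")
    spacer = "A" * spacer_len
    flipped = reverse_complement(HALF_SITE)
    if orientation == "direct":
        return HALF_SITE + spacer + HALF_SITE
    if orientation == "inverted":
        return HALF_SITE + spacer + flipped
    if orientation == "everted":
        return flipped + spacer + HALF_SITE
    raise ValueError(f"unknown orientation: {orientation}")


if __name__ == "__main__":
    for orientation in ("direct", "inverted", "everted"):
        print(orientation, dimer_site(orientation, 3))
```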

RU486- and Tamoxifen-inducible promoter constructs. 10xC7-TATA,
10xB3-TATA, and 10xN1-TATA fragments were assembled from two pairs of
complementary oligonucleotides each and cloned into SacI-XmaI linearized
pGL3-Basic (Promega), upstream of the firefly luciferase coding region, creating
the plasmids 10xC7-TATA-luc, 10xB3-TATA-luc, and 10xN1-TATA-luc. To
generate the 10xN1-TATA-lacZ reporter construct, the lacZ coding region was
excised from pβgal-Basic (Clontech) and used to replace the luciferase coding
region of 10xN1-TATA-luc via HindIII-BamHI digestion.

Luciferase and β-gal reporter assays. For all transfections, HeLa cells
were plated in 24-well dishes and used at a confluency of 40-60%. For luciferase
reporter assays, 175 ng reporter plasmid (promoter constructs in pGL3 or, as
negative control, pGL3-Basic) and 25 ng effector plasmid (zinc finger-steroid
receptor fusions in pcDNA3 or, as negative control, empty pcDNA3) were
transfected using the Lipofectamine reagent (Gibco BRL). After approximately 24
h, expression was induced by the addition of 10 nM RU486 (Biomol), 100 nM
4-OHT (Sigma), or 5 µM Ponasterone A (Invitrogen). Cell extracts were prepared
approximately 48 hours after transfection and assayed for luciferase activity using
the Promega luciferase assay reagent in a MicroLumat LB96P luminometer
(EG&G Berthold, Gaithersburg, MD). For dual reporter assays, 85 ng luciferase
reporter plasmid, 85 ng β-gal reporter plasmid, and 15 ng of each of the two
effector plasmids were transfected. β-gal activity was measured using the
luminescent β-galactosidase detection kit II (Clontech).

Zinc finger-steroid receptor fusion constructs with N-terminal
effector domains. The VP16 coding region was PCR amplified from
pcDNA3/C7-VP16 using the primers VPNhe-F (5'-GAG GAG GAG GAG GCT
AGC GCC ACC ATG GGG CGC GCC GGC GCT CCC CCG ACC GAT GTC
AGC CTG-3') (SEQ ID NO: 5), and VPHind-B (5'-GAG GAG GAG GAG AAG
CTT GTT AAT TAA ACC GTA CTC GTC AAT TCC AAG GGC ATC G-3')
(SEQ ID NO: 6) or VPNLSHind-B (5'-GAG GAG GAG GAG AAG CTT AAC
TTT GCG TTT CTT TTT CGG GTT AAT TAA ACC GTA CTC GTC AAT
TCC AAG GGC ATC G-3') (SEQ ID NO: 7). The C7 coding region was
amplified from the same plasmid, using the primers C7Hind-F (5'-GAG GAG
GAG GAG AAG CTT GGG GCC ACG GCG GCC CTC GAG CCC TAT GC-3')
(SEQ ID NO: 8), and C7Bam-B (5'-GAG GAG GGA TCC CCC TGG CCG
GCC TGG CCA CTA GTT CTA GAG TC-3') (SEQ ID NO: 9) or C7NLSBam-B
(5'-GAG GAG GGA TCC CCA ACT TTG CGT TTC TTT TTC GGC TGG
CCG GCC TGG CCA CTA GTT CTA GAG TC-3') (SEQ ID NO: 10). The
human PR truncated LBD (aa 645-914) was amplified from
pAPCMVGL914VPc'-SV [Wang, Y., Xu, J., Pierson, T., O'Malley, B. W., and
Tsai, S. Y. (1997) Gene Therapy 4, 432-441] using the primers PRBam-F (5'-GAG
GAG GAG GAG GGA TCC AGT CAG AGT TGT GAG AGC ACT GGA TGC
TG-3') (SEQ ID NO: 11) and PREco-B (5'-GAG GAG GAA TTC TCA AGC
AAT AAC TTC AGA CAT CAT TTC TGG AAA TTC-3') (SEQ ID NO: 12).
The VP16-C7-PR, VP16-NLS-C7-PR, and VP16-C7-NLS-PR coding regions
were then assembled in pcDNA3.1(+)Zeo (Invitrogen) using the NheI, HindIII,
BamHI, and EcoRI restriction sites incorporated in the PCR primers. In the
resulting constructs, the C7 coding regions were flanked by two SfiI sites, and the
VP16 coding regions by AscI and PacI sites. These restriction sites were
introduced to facilitate the exchange of DBDs and effector domains, respectively.
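
The cloning strategy relies on restriction sites built into the primers, which can be checked by scanning each primer for the relevant recognition sequences. The sketch below uses the standard recognition sequences of the named enzymes and a subset of the primers given above; because all of these sites are palindromic, searching the primer strand alone is sufficient. The helper name is hypothetical.

```python
import re

# Recognition sequences written 5'->3'; N stands for any base.
SITES = {
    "NheI": "GCTAGC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "EcoRI": "GAATTC",
    "AscI": "GGCGCGCC",
    "PacI": "TTAATTAA",
    "SfiI": "GGCCNNNNNGGCC",
}

# Primer sequences copied from the paragraph above.
PRIMERS = {
    "VPNhe-F": "GAG GAG GAG GAG GCT AGC GCC ACC ATG GGG CGC GCC GGC GCT "
               "CCC CCG ACC GAT GTC AGC CTG",
    "VPHind-B": "GAG GAG GAG GAG AAG CTT GTT AAT TAA ACC GTA CTC GTC AAT "
                "TCC AAG GGC ATC G",
    "C7Hind-F": "GAG GAG GAG GAG AAG CTT GGG GCC ACG GCG GCC CTC GAG CCC "
                "TAT GC",
    "C7Bam-B": "GAG GAG GGA TCC CCC TGG CCG GCC TGG CCA CTA GTT CTA GAG TC",
    "PREco-B": "GAG GAG GAA TTC TCA AGC AAT AAC TTC AGA CAT CAT TTC TGG "
               "AAA TTC",
}


def sites_in(primer: str) -> list:
    """Names of the enzymes in SITES whose recognition site occurs in primer."""
    seq = primer.replace(" ", "").upper()
    found = []
    for enzyme, site in SITES.items():
        pattern = site.replace("N", "[ACGT]")
        if re.search(pattern, seq):
            found.append(enzyme)
    return found


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(name, sites_in(primer))
```

The scan finds NheI and AscI sites in VPNhe-F, HindIII and PacI sites in VPHind-B, and a BamHI site together with an SfiI site in C7Bam-B, in line with the flanking sites described in the text.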

To generate the VP16-C7-ER, VP16-NLS-C7-ER, and VP16-C7-NLS-ER
constructs, the point-mutated murine ER LBD coding region (aa 281-599, G525R)
was excised from pBabe/Myc-ER [Littlewood, T. D., Hancock, D. C., Danielian,
P. S., Parker, M. G., and Evan, G. I. (1995) Nucleic Acids Res. 23, 1686-1690], and
used to replace the PR LBD coding region via BamHI-EcoRI restriction
digestion.

WO 02/06463    PCT/EP01/08190

To generate fusion constructs with B3 or N1 DBDs, C7 was replaced by
the B3 or N1 coding regions via SfiI digestion. Fusion constructs containing a
VP64 effector domain were produced by replacing VP16 by the VP64 coding
region via AscI-PacI digestion.

Zinc finger-steroid receptor fusion constructs with C-terminal
effector domains. The truncated human PR LBD was amplified from
pAPCMVGL914VPc'-SV [Wang, Y., Xu, J., Pierson, T., O'Malley, B. W., and
Tsai, S. Y. (1997) Gene Therapy 4, 432-441] using the primers PRFse-F (5'-GAG
GAG GAG GAG GAG GGC CGG CCG CGT CGA CCA GGT CAG AGT TGT
GAG AGC ACT GGA TGC-3') (SEQ ID NO: 13) and PRAsc-B (5'-GAG GAG
GAG GAG GAG GGC GCG CCC CGT CGA CCC AGC AAT AAC TTC AGA
CAT CAT TTC TGG-3') (SEQ ID NO: 14). The point-mutated mouse ER LBD
was amplified from pBabe/Myc-ER [Littlewood, T. D., Hancock, D. C.,
Danielian, P. S., Parker, M. G., and Evan, G. I. (1995) Nucleic Acids Res. 23,
1686-1690] using the primers ERFse-F (5'-GAG GAG GAG GAG GAG GGC CGG
CCG CCG AAA TGA AAT GGG TGC TTC AGG AGA C-3') (SEQ ID NO: 15)
and ERAsc-B (5'-GAG GAG GAG GAG GAG GGC GCG CCC GAT CGT GTT
GGG GAA GCC CTC TGC TTC-3') (SEQ ID NO: 16). The resulting PCR
products were then inserted into pcDNA3/E2C-VP16 [Beerli, R. R., Segal, D. J.,
Dreier, B., and Barbas, C. F., III (1998) Proc. Natl. Acad. Sci. USA 95,
14628-14633], in between the E2C and VP16 coding regions, via digestion with the
restriction endonucleases FseI and AscI.

To generate fusion constructs with B3 or N1 DBDs, E2C was replaced by
the B3 or N1 coding regions via SfiI digestion. Fusion constructs containing a
VP64 effector domain were produced by replacing VP16 by the VP64 coding
region via AscI-PacI digestion.

Heterodimeric switch constructs. For construction of the E2C-ER
fusion, the point-mutated mouse ER LBD was amplified from pBabe/Myc-ER
[Littlewood, T. D., Hancock, D. C., Danielian, P. S., Parker, M. G., and Evan, G.
I. (1995) Nucleic Acids Res. 23, 1686-1690] using the primers ERFse-F and ERPac-B
(5'-GAG GAG GAG GAG GAG TTA ATT AAG ATC GTG TTG GGG AAG
CCC TCT GCT TC-3') (SEQ ID NO: 17). The PCR product was then inserted
into the construct pcDNA3/E2C-VP64, replacing the VP64 coding region, via
FseI-PacI digestion. To generate the ER-VP64 fusion, the ER LBD was
amplified using the primers ERATGBam-F (5'-GAG GAG GAG GAG GGA
TCC GCC ACC ATG CGA AAT GAA ATG GGT GCT TCA GGA GAC-3')
(SEQ ID NO: 18) and ERAsc-B. The PCR product was then inserted into
pcDNA3/E2C-VP64 [Beerli, R. R., Segal, D. J., Dreier, B., and Barbas, C. F., III
(1998) Proc. Natl. Acad. Sci. USA 95, 14628-14633], replacing the E2C coding
region, via BamHI-AscI digestion.

Single-chain switch constructs. For construction of single-chain fusions
with two ER LBDs, the point-mutated mouse ER LBD was amplified from
pBabe/Myc-ER [Littlewood, T. D., Hancock, D. C., Danielian, P. S., Parker, M.
G., and Evan, G. I. (1995) Nucleic Acids Res. 23, 1686-1690] either using the primers
ERFse-F and ERSpeI-B (5'-GAG GAG GAG GAG GAG GAG ACT AGT GGA
ACC ACC CCC ACC ACC GCC CGA GCC ACC GCC ACC AGA GGA GAT
CGT GTT GGG GAA GCC CTC TGC-3') (SEQ ID NO: 19), or using the primers
ERNheI-F1 (for 18 aa linker construct; 5'-GAG GAG GAG GAG GAG GAG
GCT AGC GGC GGT GGC GGT GGC TCC TCT GGT GGC GGT GGC GGT
TCT TCC AAT GAA ATG GGT GCT TCA GGA GAC-3') (SEQ ID NO: 20) or
ERNheI-F2 (for 30 aa linker construct; 5'-GAG GAG GAG GAG GAG GAG
GCT AGC TCT TCC AAT GAA ATG GGT GCT TCA GGA GAC-3') (SEQ ID
NO: 21), and ERAsc-B. The PCR products were then digested with, respectively,
FseI and SpeI, or NheI and AscI, and inserted into FseI-AscI linearized
pcDNA3/E2C-VP64 [Beerli, R. R., Segal, D. J., Dreier, B., and Barbas, C. F., III
(1998) Proc. Natl. Acad. Sci. USA 95, 14628-14633].

For construction of RXR-EcR single-chain fusions, the ligand binding
domain of the human retinoid X receptor (hRXRα, aa373-654) was PCR
amplified from pVgRXR (Invitrogen) using the primers RXRFse-F (5'-GAG
GAG GAG GGC CGG CCG GGA AGC CGT GCA GGA GGA GCG GC-3')
(SEQ ID NO: 22) and RXRSpe-B (5'-GAG GAG GAG GAG GAG ACT AGT
GGA ACC ACC CCC ACC ACC GCC CGA GCC ACC GCC ACC AGA GGA
AGT CAT TTG GTG CGG CGC CTC CAG C-3') (SEQ ID NO: 23). The ligand
binding domain of the ecdysone receptor (EcR, aa202-462, Drosophila
melanogaster) was PCR amplified from pVgRXR using the primers EcRNhe-F1
(for 18aa linker construct; 5'-GAG GAG GAG GAG GCT AGC TCT TCC GGT
GGC GGC CAA GAC TTT GTT AAG AAG G-3') (SEQ ID NO: 24), or
EcRNhe-F2 (for 30aa linker construct; 5'-GAG GAG GAG GAG GCT AGC
GGC GGT GGC GGT GGC TCC TCT GGT GGC GGT GGC GGT TCT TCC
GGT GGC GGC CAA GAC TTT GTT AAG AAG G-3') (SEQ ID NO: 25), and
EcRAsc-B (5'-GAG GAG GAG GGC GCG CCC GGC ATG AAC GTC CCA
GAT CTC CTC GAG-3') (SEQ ID NO: 26). The PCR products were then
digested with, respectively, FseI and SpeI, or NheI and AscI, and inserted into
FseI-AscI linearized pcDNA3/E2C-VP64 [Beerli, R. R., Segal, D. J., Dreier, B.,
and Barbas, C. F., III (1998) Proc. Natl. Acad. Sci. USA 95, 14628-14633]. DNA
binding domains were exchanged via SfiI digestion, effector domains via AscI-
PacI digestion.

To generate the E2C-RLLE-VP64 fusion construct with the 36aa linker, the RXR
LBD was PCR amplified from pcDNA3/E2C-RE-VP64 using the primers
RXRFse-F and RXRSpeLL-B (5'-GAG GAG GAG GAG GAG ACT AGT AGA
GCC ACC GCC CCC TTC AGA ACC GCC CGA GCC ACC GCC ACC AGA
GG-3') (SEQ ID NO: 27). The EcR LBD was amplified from the same plasmid,
using the primers EcRNheLL-F (5'-GAG GAG GAG GAG GCT AGC GGG
GGT TCG GAG GGT GGC GGG TCT GAG GGT GGG GGT GGT TCC ACT
AGC TCT TCC-3') (SEQ ID NO: 28) and EcRAsc-B. The PCR products were
inserted into pcDNA3/E2C-VP64 as described above.
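
The cloning scheme above relies on each primer carrying the restriction site
named in its label. As an illustrative check only, and not part of the
described method, the short Python sketch below scans four of the primers
listed above for the recognition sequences of the enzymes mentioned in this
example. The recognition sequences are the standard ones for these enzymes;
the names SITES, PRIMERS and sites_in are hypothetical helpers introduced for
this sketch.

```python
# Standard recognition sequences (5'->3') of enzymes named in the text.
SITES = {
    "FseI": "GGCCGGCC",
    "AscI": "GGCGCGCC",
    "PacI": "TTAATTAA",
    "SpeI": "ACTAGT",
    "NheI": "GCTAGC",
    "BamHI": "GGATCC",
}

# Primer sequences copied from the text above (SEQ ID NOs 22, 23, 24, 26).
PRIMERS = {
    "RXRFse-F": "GAG GAG GAG GGC CGG CCG GGA AGC CGT GCA GGA GGA GCG GC",
    "RXRSpe-B": ("GAG GAG GAG GAG GAG ACT AGT GGA ACC ACC CCC ACC ACC GCC "
                 "CGA GCC ACC GCC ACC AGA GGA AGT CAT TTG GTG CGG CGC CTC "
                 "CAG C"),
    "EcRNhe-F1": ("GAG GAG GAG GAG GCT AGC TCT TCC GGT GGC GGC CAA GAC TTT "
                  "GTT AAG AAG G"),
    "EcRAsc-B": "GAG GAG GAG GGC GCG CCC GGC ATG AAC GTC CCA GAT CTC CTC GAG",
}


def sites_in(primer: str) -> list[str]:
    """Return the names of all listed enzymes whose site occurs in the primer."""
    seq = primer.replace(" ", "").upper()
    return sorted(name for name, site in SITES.items() if site in seq)


if __name__ == "__main__":
    for name, primer in PRIMERS.items():
        print(f"{name}: {', '.join(sites_in(primer)) or 'no listed site'}")
```

Each of the four primers contains exactly the site implied by its name (FseI,
SpeI, NheI and AscI, respectively), consistent with the digestion steps
described above.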

EXAMPLE 2: Gene Switches 

Generation of hormone-regulated zinc finger-steroid receptor fusion
proteins. Previous studies have shown the potential of engineered C2-H2 zinc
finger proteins for the regulation of target gene expression [Liu, Q., Segal, D. J.,
Ghiara, J. B., and Barbas, C. F., III (1997) Proc. Natl. Acad. Sci. USA 94, 5525-
5530; Kim, J. S., and Pabo, C. O. (1997) J. Biol. Chem. 272, 29795-29800; Beerli,
R. R., Segal, D. J., Dreier, B., and Barbas, C. F., III (1998) Proc. Natl. Acad. Sci.
USA 95, 14628-14633; Beerli, R. R., Dreier, B., and Barbas, C. F., III (2000) Proc.
Natl. Acad. Sci. USA 97, 1495-1500]. However, to fully realize the potential of
engineered zinc finger proteins, it is desirable that their otherwise constitutive
DNA binding activity be rendered ligand-dependent. The ligand binding domains
(LBDs) of the human progesterone receptor (hPR) and the murine estrogen
receptor (mER) have previously been used for the regulation of heterologous
proteins, after having been modified to lack binding to the natural hormones while
retaining binding to synthetic antagonists [Littlewood, T. D., Hancock, D. C.,
Danielian, P. S., Parker, M. G., and Evan, G. I. (1995) Nucleic Acids Res. 23, 1686-
1690; Wang, Y., Xu, J., Pierson, T., O'Malley, B. W., and Tsai, S. Y. (1997) Gene
Therapy 4, 432-441]. Thus, the Zif268 variant C7 [Wu, H., Yang, W.-P., and
Barbas, C. F., III (1995) Proc. Natl. Acad. Sci. USA 92, 344-348] was fused to a
transcriptional activation domain plus the LBD of either of the two nuclear
hormone receptors. The VP64-C7-PR fusion protein contains an N-terminal VP64
activation domain [Beerli, R. R., Segal, D. J., Dreier, B., and Barbas, C. F., III
(1998) Proc. Natl. Acad. Sci. USA 95, 14628-14633], and a C-terminal hPR LBD
(aa645-914) lacking amino acids 915-933, responsive to the progesterone
antagonist RU486/Mifepristone but not to progesterone [Wang, Y., Xu, J.,
Pierson, T., O'Malley, B. W., and Tsai, S. Y. (1997) Gene Therapy 4, 432-441]. The
VP64-C7-ER fusion protein contains a C-terminal mER LBD (aa282-599) with a
single amino acid substitution (G525R), and is responsive to the estrogen
antagonist 4-hydroxy-tamoxifen (4-OHT) but not to estrogen [Littlewood, T. D.,
Hancock, D. C., Danielian, P. S., Parker, M. G., and Evan, G. I. (1995) Nucleic Acids
Res. 23, 1686-1690].

Determination of the optimal response element for zinc finger-steroid
receptor fusion proteins. Naturally occurring steroid receptors bind DNA as
dimers and typically recognize response elements consisting of palindromic
sequences [Evans, R. M. (1988) Science 240, 889-895; Carson-Jurica, M. A.,
Schrader, W. T., and O'Malley, B. W. (1990) Endocrine Reviews 11, 201-220].
Moreover, direct repeats have been shown to serve as binding sites for receptor
dimers in some cases [Aumais, J. P., Lee, H. S., DeGannes, C.,
Horsford, J., and White, J. H. (1996) J. Biol. Chem. 271, 12568-12577]. Given this
flexibility in DNA recognition by naturally occurring receptor dimers, the
optimal structure of a response element for an artificial, zinc finger based
transcriptional switch was not known. However, to develop an efficient, hormone-
inducible system for the regulation of target gene expression, a detailed
knowledge of the binding site architecture is required.

To determine the optimal orientation and spacing of the two half-sites of a
response element for a zinc finger-LBD fusion protein, a series of reporter
plasmids was constructed. Each contains two C7 binding sites upstream of a
TATA box and a firefly luciferase coding region. The two C7 binding sites were
introduced in different orientations (direct, inverted, or everted repeat) and with
various spacings (no spacing or 1 to 5 bp spacing). Plasmids directing expression
of VP64-C7-PR or VP64-C7-ER fusion constructs were then co-transfected with
the various reporter plasmids and assayed for hormone-induced luciferase
expression. Significantly, each of the C7 dimer binding sites was able to act as a
response element for both PR and ER based proteins, albeit with varying efficiency.
In contrast, a reporter plasmid with a single C7 binding site was not activated,
indicating that hormone-induced activation of transcription was mediated by
dimers.
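
The reporter series described above (three orientations, spacings of 0 to 5
bp) can be enumerated with a few lines of Python. This is a minimal sketch
under stated assumptions, not the sequences actually cloned: the C7 site
5'-GCG TGG GCG-3' is taken from this specification, the spacer nucleotides are
not given in the text and are shown as the placeholder N, and the orientation
convention used here (inverted = site followed by its reverse complement,
everted = reverse complement followed by the site) is an assumption. The
function names are hypothetical.

```python
C7_SITE = "GCGTGGGCG"  # 5'-GCG TGG GCG-3', the C7 binding site

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence (N stays N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def response_element(site: str, orientation: str, spacing: int,
                     filler: str = "N") -> str:
    """Two half-sites in the given orientation, separated by `spacing` bp.

    The orientation convention and the filler base are assumptions made
    for illustration; the text does not specify them.
    """
    rc = revcomp(site)
    halves = {
        "direct": (site, site),
        "inverted": (site, rc),
        "everted": (rc, site),
    }
    if orientation not in halves:
        raise ValueError(f"unknown orientation: {orientation}")
    first, second = halves[orientation]
    return first + filler * spacing + second


# All 3 orientations x 6 spacings (0 to 5 bp) described in the text.
VARIANTS = {
    (orientation, spacing): response_element(C7_SITE, orientation, spacing)
    for orientation in ("direct", "inverted", "everted")
    for spacing in range(6)
}

if __name__ == "__main__":
    for (orientation, spacing), seq in VARIANTS.items():
        print(f"{orientation:9s} {spacing} bp  {seq}")
```

Under these assumptions the series comprises 18 two-site reporters, to which
the single-site control mentioned above would be added separately.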

Optimal spacing depended on the orientation of the two half-sites. In the
case of the PR fusion protein, optimal spacing seemed to be at 2-3 bp for inverted
repeats and 3 bp for everted repeats. Response elements consisting of direct
repeats had no single optimal spacing; the best response was obtained with 4-5 bp,
or no spacing at all. For the ER fusion protein, optimal spacing was at 3-4 bp for
direct repeats, 1-2 bp for inverted repeats, and 3 bp for everted repeats. It should
be noted that there were significant variations in the basal, i.e. ligand-independent,
activity of PR and ER fusion proteins, depending on the response element tested.
Most notably, increasing the spacing of direct repeats from 3 to 4 bp led to a 1.9-
fold higher basal activity of VP64-C7-PR, and even a 3.7-fold increase in the case
of VP64-C7-ER. High basal activity is extremely undesirable for an inducible
promoter system, where tight control over the expression levels of a particular
gene of interest is often required, especially if the gene product is toxic. Thus, in
choosing appropriate response elements, particular attention must be paid not only
to their hormone inducibility but also to their basal activity in the presence of the
regulatory protein. The response element consisting of direct repeats with a
spacing of three nucleotides was considered to be a good choice for use in a
hormone-inducible artificial promoter, since it was compatible with both PR and
ER fusion proteins. Significantly, its basal activity in the presence of either PR or
ER fusion proteins was among the lowest of all response elements tested.
Furthermore, good hormone-induced activation of transcription was observed with
both VP64-C7-PR (3.9-fold) and VP64-C7-ER (9.5-fold).

Generation of novel DNA binding domains. While the use of the C7
DNA binding domain was well suited for the preliminary studies described above,
it may not be a good choice for incorporation into an inducible transcriptional
regulator. The C7 protein is a variant of the mouse transcription factor Zif268
[Pavletich, N. P., and Pabo, C. O. (1991) Science 252, 809-817], with increased
affinity but unchanged specificity [Wu, H., Yang, W.-P., and Barbas, C. F., III
(1995) Proc. Natl. Acad. Sci. USA 92, 344-348]. We reasoned that the use of
alternate DNA binding domains would minimize potential pleiotropic effects of
the chimeric regulators. Previously, we described a strategy for the rapid
assembly of zinc finger proteins from a family of predefined zinc finger domains
specific for each of the sixteen 5'-GNN-3' DNA triplets [Beerli, R. R., Segal, D.
J., Dreier, B., and Barbas, C. F., III (1998) Proc. Natl. Acad. Sci. USA 95, 14628-
14633; Segal, D. J., Dreier, B., Beerli, R. R., and Barbas, C. F., III (1999) Proc.
Natl. Acad. Sci. USA 96, 2758-2763]. Three finger proteins binding any desired 5'-
(GNN)3-3' sequence can be rapidly prepared by grafting the amino acid residues
involved in base-specific DNA recognition into the framework of the consensus
three finger protein Sp1C [Desjarlais, J. R., and Berg, J. M. (1993) Proc. Natl.
Acad. Sci. USA 90, 2256-2260]. To date, well over 100 three finger proteins have
been produced in our laboratory. Two of these, B3 and N1, were chosen to be
used in inducible transcriptional regulators (Figure 1A). The B3 and N1 proteins
are designed to bind the sequences 5'-GGA GGG GAC-3' and 5'-GGG GTA GAA-
3', respectively. To verify their DNA binding specificity, these proteins were
purified as MBP-fusions and tested by ELISA analysis using an arbitrary
selection of oligonucleotides containing 5'-(GNN)3-3' sequences (Fig. 1B).
Significantly, both proteins recognized their target sequence and showed no
crossreactivity to any of the other 5'-(GNN)3-3' sequences tested. However, as
judged by ELISA, binding of N1 was much weaker than binding of B3.
Therefore, affinities were determined by electrophoretic mobility-shift analysis.
The B3 protein bound its target sequence with a Kd value of 15 nM, similar to the
Kd values we previously reported for other three finger proteins [Beerli, R. R.,
Segal, D. J., Dreier, B., and Barbas, C. F., III (1998) Proc. Natl. Acad. Sci. USA 95,
14628-14633]. In contrast, N1 affinity for its target was dramatically lower and
we estimate its Kd value to be in the range of 5-10 µM. The fact that the two
proteins had very different affinities for their respective target sequences was
considered positive, since it allows investigation of the influence of affinity on the
functionality of an inducible expression system.
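
To give a rough sense of what such an affinity difference means, the sketch
below evaluates a simple one-site binding isotherm, occupancy = [P] / ([P] +
Kd). This is an illustration under stated assumptions and not an analysis
performed in this example: it ignores dimerization and cooperativity, it
reads the N1 estimate as micromolar (consistent with its affinity being
described as dramatically lower than that of B3), and the free protein
concentration of 100 nM is a hypothetical value chosen only for the
calculation.

```python
def fractional_occupancy(protein_molar: float, kd_molar: float) -> float:
    """Fraction of sites bound for simple 1:1 binding at a given free
    protein concentration (no cooperativity, protein in excess)."""
    if protein_molar < 0 or kd_molar <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return protein_molar / (protein_molar + kd_molar)


KD_B3 = 15e-9                # 15 nM, from the text
KD_N1_RANGE = (5e-6, 10e-6)  # 5-10 uM, estimate from the text as read here
PROTEIN = 100e-9             # hypothetical free protein concentration

if __name__ == "__main__":
    print(f"B3: {fractional_occupancy(PROTEIN, KD_B3):.2f}")
    for kd in KD_N1_RANGE:
        print(f"N1 (Kd={kd * 1e6:.0f} uM): "
              f"{fractional_occupancy(PROTEIN, kd):.3f}")
```

With these illustrative numbers, a B3 site would be mostly occupied while an
N1 site would be occupied only about 1 to 2 percent of the time, which is the
kind of contrast the comparison of the two DNA binding domains is meant to
probe.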

RU486- and 4-OHT-inducible systems for the control of gene
expression. To allow for a comparative analysis, a series of RU486- or 4-OHT-
inducible transcriptional regulators was constructed containing either the B3 or
the N1 DNA binding domain. The role of placement of the activation domain was
investigated by fusing it to either the N- or the C-terminus of the protein. Two
different activation domains were compared: the herpes simplex virus VP16
transactivation domain [Sadowski, I., Ma, J., Triezenberg, S., and Ptashne, M.
(1988) Nature 335, 563-564], and the synthetic VP64 activation domain, which
consists of 4 tandem repeats of VP16's minimal activation domain [Beerli, R. R.,
Segal, D. J., Dreier, B., and Barbas, C. F., III (1998) Proc. Natl. Acad. Sci. USA 95,
14628-14633].

Synthetic promoters were constructed based on the B3 and N1 DNA target
sequences, and the optimal response element structure defined above. The 10xB3-
TATA-luc and 10xN1-TATA-luc plasmids each contain five response elements,
consisting of direct repeats spaced by three nucleotides, upstream of a TATA box
and a firefly luciferase coding region. The response elements are separated from
each other by six nucleotides, which should allow the concomitant binding of five
dimers and thus maximize the promoter activity. The activity of the various
fusion constructs was assessed by transient cotransfection studies with the cognate
TATA reporter plasmids in HeLa cells (Table 1).
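
The layout of the binding-site array just described (five response elements,
each a direct repeat with a 3-nucleotide spacer, with 6 nucleotides between
elements) can be sketched as follows. This is only an illustration of the
stated geometry: the B3 target 5'-GGA GGG GAC-3' is taken from this
specification, while the spacer and inter-element nucleotides are not given
in the text and are shown as the placeholder N. The function names are
hypothetical.

```python
B3_SITE = "GGAGGGGAC"  # 5'-GGA GGG GAC-3', the B3 target sequence


def direct_repeat_element(site: str, spacing: int = 3,
                          filler: str = "N") -> str:
    """One response element: two copies of the site in direct orientation."""
    return site + filler * spacing + site


def binding_site_array(site: str, n_elements: int = 5, element_gap: int = 6,
                       filler: str = "N") -> str:
    """Response elements joined by `element_gap` placeholder nucleotides.

    Only the array of binding sites is modeled; the TATA box and the
    luciferase coding region are not included.
    """
    element = direct_repeat_element(site, filler=filler)
    return (filler * element_gap).join([element] * n_elements)


if __name__ == "__main__":
    array = binding_site_array(B3_SITE)
    print(array)
    print("length:", len(array), "bp; sites:", array.count(B3_SITE))
```

With the default parameters the array carries ten copies of the 9-bp site,
which matches the "10x" in the plasmid names.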

Table 1

                     LBD=PR               LBD=ER
Construct        exp. 1    exp. 2     exp. 1    exp. 2

VP16-B3-LBD        34x       36x        37x       26x
VP64-B3-LBD        37x       24x        26x       27x
B3-LBD-VP16       115x      116x        47x       58x
B3-LBD-VP64       110x       85x        62x       99x
VP16-N1-LBD       188x      159x       101x       39x
VP64-N1-LBD       206x      390x        49x       58x
N1-LBD-VP16       282x      203x        24x       30x
N1-LBD-VP64       151x      129x      1319x      464x

In general, the ER fusion proteins turned out to be the stronger
transactivators, and 4-OHT-induced luciferase activity was usually 3 to 6 times
higher than RU486-induced luciferase activity mediated by PR fusion proteins.
However, since the basal, i.e. ligand-independent, activity of ER chimeras was
often somewhat higher, their hormone-induced fold-stimulation was not generally
better. Hormone-dependent gene activation in excess of 2 orders of magnitude
was commonly observed with both PR and ER fusion proteins, values that are
significantly better than what was previously reported for the Gal4-PR fusion
protein GLVPc' [Wang, Y., Xu, J., Pierson, T., O'Malley, B. W., and Tsai, S. Y.
(1997) Gene Therapy 4, 432-441].
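
For readers who wish to compare constructs across the two experiments, the
fold-induction values of Table 1 can be averaged per construct and LBD, as in
the sketch below. The numbers are transcribed from Table 1 (two entries are
read here as 110x and 101x); the averaging itself and the helper names are
illustrative additions, not an analysis reported in this example.

```python
from statistics import mean

# Fold induction from Table 1: construct -> LBD -> (exp. 1, exp. 2).
TABLE_1 = {
    "VP16-B3-LBD": {"PR": (34, 36), "ER": (37, 26)},
    "VP64-B3-LBD": {"PR": (37, 24), "ER": (26, 27)},
    "B3-LBD-VP16": {"PR": (115, 116), "ER": (47, 58)},
    "B3-LBD-VP64": {"PR": (110, 85), "ER": (62, 99)},
    "VP16-N1-LBD": {"PR": (188, 159), "ER": (101, 39)},
    "VP64-N1-LBD": {"PR": (206, 390), "ER": (49, 58)},
    "N1-LBD-VP16": {"PR": (282, 203), "ER": (24, 30)},
    "N1-LBD-VP64": {"PR": (151, 129), "ER": (1319, 464)},
}


def mean_fold(lbd: str) -> dict:
    """Mean fold induction of each construct for the given LBD."""
    return {name: mean(values[lbd]) for name, values in TABLE_1.items()}


def best_construct(lbd: str) -> tuple:
    """Construct with the highest mean fold induction for the given LBD."""
    means = mean_fold(lbd)
    name = max(means, key=means.get)
    return name, means[name]


if __name__ == "__main__":
    for lbd in ("PR", "ER"):
        name, fold = best_construct(lbd)
        print(f"{lbd}: best mean fold induction {fold:g}x ({name})")
```

Averaged this way, the highest values are obtained with N1-based constructs
for both LBDs.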

The placement of the activation domain had a significant influence on the
activity of the chimeric regulators. However, favored placement was dependent
on the nature of the activation domain. Whereas the VP16 domain yielded the
more potent activators when placed at the C-terminus, the VP64 domain was more
active at the N-terminus. Accordingly, direct comparisons showed that an N-terminal
VP64 was more potent than an N-terminal VP16 domain, and a C-terminal VP16
was more potent than a C-terminal VP64 domain. The nature and placement of
the activation domain were also found to have an influence on the basal activity of
the chimeric regulators. In particular, a relatively high basal activity was observed
in the case of regulators with an N-terminal VP64 domain.

The nature of the DNA binding domain had a major influence on the
extent of ligand-dependence of the chimeras. Use of the N1 protein as DNA
binding domain led to more tightly regulated fusion constructs with significantly
better fold-stimulation of promoter activities than the use of B3, likely due to the
dramatic affinity differences between N1 and B3. In particular, the N1-ER-VP64
regulator had no significant basal activity and was capable of mediating a 464- to
1319-fold 4-OHT-induced activation of the 10xN1-TATA minimal promoter
(Table 1). Ligand-induced activation of gene expression over a range
of 3 orders of magnitude is particularly remarkable, since it has thus far only been
reported for the tetracycline controlled system of gene regulation [Gossen, M.,
and Bujard, H. (1992) Proc. Natl. Acad. Sci. USA 89, 5547-5551; Gossen, M.,
Freundlieb, S., Bender, G., Muller, G., Hillen, W., and Bujard, H. (1995) Science
268, 1766-1769].

Concomitant regulation of multiple promoters. Zinc finger technology
has made a large repertoire of DNA binding specificities available for use in
protein engineering [Beerli, R. R., Segal, D. J., Dreier, B., and Barbas, C. F., III
(1998) Proc. Natl. Acad. Sci. USA 95, 14628-14633; Segal, D. J., Dreier, B., Beerli,
R. R., and Barbas, C. F., III (1999) Proc. Natl. Acad.
Sci. USA 96, 2758-2763; Beerli, R. R., Dreier, B., and Barbas, C. F., III (2000)
Proc. Natl. Acad. Sci. USA 97, 1495-1500]. The availability of different steroid
hormone receptor-derived regulatory domains [Littlewood, T. D., Hancock, D. C.,
Danielian, P. S., Parker, M. G., and Evan, G. I. (1995) Nucleic Acids Res. 23, 1686-
1690; Wang, Y., Xu, J., Pierson, T., O'Malley, B. W., and Tsai, S. Y. (1997) Gene
Therapy 4, 432-441], and the ability to redirect chimeric regulators to virtually any
desired target sequence should make it possible to independently regulate the
expression of multiple genes at the same time. To examine this possibility, a
reporter plasmid was constructed directing expression of β-galactosidase (β-gal)
under the control of the 10xN1-TATA minimal promoter. The chimeric regulators
B3-PR-VP16 and N1-ER-VP64 were then transiently expressed in HeLa cells
along with the 10xB3-TATA-luc and 10xN1-TATA-β-gal reporter plasmids. The
transfected cells were treated with either RU486 or 4-OHT and the luciferase and
β-gal activities were monitored. Significantly, RU486 induced expression of
luciferase while having no effect on β-gal reporter gene activity. 4-OHT, on the
other hand, did not affect luciferase expression but efficiently activated β-gal
expression. These results demonstrate that the two regulator/promoter
combinations act independently of one another, and that multiple genes can
be efficiently and independently regulated by the selective addition of the desired
hormone.

Development of a monomeric hormone-dependent gene-switch. The
ability to engineer DNA binding proteins with desired specificities makes it
possible to generate artificial transcription factors capable of imposing dominant
regulatory effects on endogenous genes [Beerli, R. R., Dreier, B., and Barbas, C.
F., III (2000) Proc. Natl. Acad. Sci. USA 97, 1495-1500]. For many applications of
this technology it may be desirable that the effect on endogenous gene expression
be reversible. The use of steroid hormone receptor LBDs has the potential to
render regulation of endogenous gene expression reversible. However, one major
drawback is the fact that steroid hormone receptors, as well as the chimeric
regulators described herein, bind DNA as dimers. Thus, when the fusion protein
C7-ER-VP64 was transiently expressed in HeLa cells it was unable to regulate a
reporter construct carrying a single C7 binding site, while it readily regulated a
reporter that had two C7 binding sites and therefore accommodated binding of a
dimer (Fig. 2B). An additional problem was encountered when the C7 DBD was
replaced by E2C, which contains six zinc finger domains and recognizes the 18-
bp sequence 5'-GGG GCC GGA GCC GCA GTG-3' (SEQ ID NO: 29) in the 5'-
UTR of the proto-oncogene c-erbB-2 [Yamamoto, T., Ikawa, S., Akiyama, T.,
Semba, K., Nomura, N., Miyajima, N., Saito, T., and Toyoshima, K. (1986)
Nature 319, 230-234; Beerli, R. R., Segal, D. J., Dreier, B., and Barbas, C. F., III
(1998) Proc. Natl. Acad. Sci. USA 95, 14628-14633]. The E2C-ER-VP64 fusion
protein was constitutively active on a reporter carrying a single E2C binding site,
almost as active as an E2C-VP64 fusion without an ER LBD, and did not respond
well to hormone. Apparently, the use of a large DNA binding domain recognizing
an extended stretch of DNA with high affinity renders the chimera hormone- and
dimerization-independent.

To overcome these problems, we produced two types of ER-based
chimeric regulators, designed to be capable of regulating gene expression through
a single binding site in a hormone-dependent manner. In the first strategy, a
heterodimeric regulator was generated consisting of the engineered zinc finger
protein E2C fused to an ER LBD, as well as an ER LBD fused to a VP64
activation domain (Fig. 2A). When this heterodimeric regulator was expressed in
HeLa cells, it had no significant activity on the E2C-TATA-luc reporter plasmid
in the absence of 4-OHT. Addition of hormone led to a 3- to 5-fold stimulation of
luciferase expression, indicating the formation of functional heterodimers.
However, hormone-induced reporter gene activation was significantly lower than
that induced by an E2C-VP64 fusion protein, presumably at least in part due to
the formation of E2C-ER and ER-VP64 homodimers. Homodimers were inactive,
since neither E2C-ER nor ER-VP64 alone induced luciferase expression. In the
second strategy, fusion proteins were generated by combining the dimerization
partners E2C-ER and ER-VP64 in one single polypeptide, through a flexible
polypeptide linker. Two linkers were tested, 18 and 30 amino acids in length,
creating the proteins E2C-scER/18-VP64 and E2C-scER/30-VP64 (Fig. 2A).
These proteins were expected to be activated via intramolecular, rather than
intermolecular, dimerization and therefore to function as monomers. Combination
of two ER LBDs into one single-chain fusion construct should allow a more
efficient hormone-induced dimerization and therefore yield more efficient
activators. Indeed, when E2C-scER/18-VP64 and E2C-scER/30-VP64 were
transiently expressed in HeLa cells, they efficiently activated the E2C-TATA-luc
reporter in a largely hormone-dependent manner (Fig. 2B, 2C and 2D). Thus,
dimeric regulators requiring response elements similar to those of natural steroid
hormone receptors were successfully converted into monomeric, ligand-
dependent transcription factors.

Monomeric gene-switch based on EcR and RXR LBDs. To show that
the production of a ligand-dependent monomeric gene switch by fusion with two
LBDs is a generally applicable strategy, the utility of other nuclear hormone
receptors was tested. In particular, utility of the LBDs of the Drosophila ecdysone
receptor (EcR) was investigated. In Drosophila, this receptor functions as a
heterodimer between EcR and the product of the ultraspiracle (USP) gene [Yao,
T.-P., Forman, B. M., Jiang, Z., Cherbas, L., Chen, J.-D., McKeown, M., Cherbas,
P., and Evans, R. M. (1993) Nature 366, 476-479]. However, it has been shown
that EcR also efficiently heterodimerizes with USP's vertebrate homologue
retinoid X receptor (RXR) in response to the ecdysone agonists Muristerone A or
Ponasterone A (PonA) [Nakanishi, K. (1992) Steroids 57, 649-657; Yao, T.-P.,
Forman, B. M., Jiang, Z., Cherbas, L., Chen, J.-D., McKeown, M.,
Cherbas, P., and Evans, R. M. (1993) Nature 366, 476-479; No, D., Yao, T.-P.,
and Evans, R. M. (1996) Proc. Natl. Acad. Sci. USA 93, 3346-3351]. The EcR and
RXR LBDs were therefore used to prepare a monomeric gene switch analogous to
the scER chimeras described above (Fig. 3A). Thus, the human RXRα LBD
(aa373-654) and the Drosophila EcR LBD (aa202-462) were inserted in between
the E2C DBD and the VP64 activation domain, creating E2C-RE-VP64. In this
fusion construct, the two LBDs are connected by an 18 amino acid flexible linker,
the same that was used in E2C-scER/18-VP64. When this chimeric regulator was
transiently expressed in HeLa cells along with the E2C-TATA-luc reporter
plasmid, significant basal activity was observed. However, activity could be
increased 3-fold by PonA, showing that this artificial construct was hormone-
responsive. To improve the ligand dependence, the length of the linker connecting
the RXR and EcR LBDs was increased, a measure that seemed beneficial in the
case of the single-chain ER constructs. A longer linker should allow the LBDs to
optimize their contact and add to the conformational disorder in the unliganded
state. Indeed, when the linker was elongated to 30 aa (in E2C-RLE-VP64) or 36
aa (in E2C-RLLE-VP64), basal activity was significantly reduced and PonA led
to a 9- to 10-fold activation, an extent of responsiveness comparable to that of
the single-chain ER fusion constructs (Fig. 3B). Thus, serial connection of pairs
of nuclear hormone receptor LBDs appears to be a generally applicable strategy to
render monomeric DNA binding proteins ligand-dependent.

The hPR and mER LBDs used for the fusion proteins did not encompass
their natural SV40-like nuclear localization signals (NLS), located between amino
acids 637 and 644 in hPR, and between amino acids 260 and 267 in mER
[Carson-Jurica, M. A., Schrader, W. T., and O'Malley, B. W. (1990) Endocrine
Reviews 11, 201-220]. While it has been shown that this NLS is not required for
hormone-dependent nuclear localization of hPR, regulation of the subcellular
localization of steroid receptors appears to be complex, and it was not a priori
clear whether the presence of the SV40-like NLS was required for proper function
of the chimeric proteins. Thus, additional constructs were prepared that
incorporated an SV40 NLS (PKKKRKV, in single letter amino acid code)
(SEQ ID NO: 30), either between VP16 and C7, or between C7 and LBD.

The chimeric transcriptional regulators were then tested for their ability to
regulate the 10xC7-TATA-luc reporter plasmid in a hormone-dependent manner.
10xC7-TATA-luc contains ten C7 binding sites (5'-GCG TGG GCG-3') spaced by
5 nucleotides, and a TATA box, upstream of the firefly luciferase coding region.
Each of the fusion proteins upregulated expression of luciferase in a largely
hormone-dependent manner. RU486 stimulated the activity of VP16-C7-PR 26-
fold, while 4-OHT led to a 43-fold activation of VP16-C7-ER. There was no
detectable crossreactivity between RU486 and ER, or between 4-OHT and PR.
The presence of an NLS in either position was not only unnecessary but even
undesirable, since it led to an increased basal (i.e. hormone-independent) activity
of the fusion constructs, presumably through increased nuclear localization. Thus,
the hPR (aa645-914) and mER (aa281-599, G525R) LBDs are able to confer
hormone-dependence onto the zinc finger protein C7.

The ability to reversibly control the expression of multiple genes, or
alleles of a gene, could prove very useful for many basic research applications. In
particular, selective and independent expression of one gene, but not another (and
vice versa), by small and nontoxic ligands would allow for a comparative analysis
of gene function, both in vitro and in vivo. We have shown that our modular
system for controlling target gene expression is indeed able to independently
control the expression of two genes within the same transfected cell, as evidenced
by RU486-dependent luciferase induction and 4-OHT-induced β-gal expression.
The lack of β-gal induction by RU486, and of luciferase induction by 4-OHT,
convincingly demonstrates the specificity of the chimeric regulators described
here. Not only is the exquisite specificity of the utilized DNA binding domains
retained, but also there is no detectable crossreaction between RU486 and the ER
LBD, or between 4-OHT and the PR LBD.

WHAT IS CLAIMED IS:

1. A non-naturally occurring polypeptide comprising two ligand
binding domains derived from nuclear hormone receptors operatively linked to a
first functional domain.

2. The polypeptide of claim 1 wherein the two ligand binding
domains are covalently linked by means of a peptide linker.

3. The polypeptide of claim 2 wherein the linker contains from about
10 to about 40 amino acid residues.

4. The polypeptide of claim 2 wherein the linker contains from about
15 to about 35 amino acid residues.

5. The polypeptide of claim 2 wherein the linker contains from about
18 to about 30 amino acid residues.

6. The polypeptide of claim 1 wherein the first and second ligand
binding domains are derived from different nuclear hormone receptors.

7. The polypeptide of claim 1 wherein the first and second binding
domains are derived from the same nuclear hormone receptor.

8. The polypeptide of claim 1 wherein the nuclear hormone receptor
is an estrogen receptor, a progesterone receptor, an ecdysone receptor or a retinoid
X acid receptor.

9. The polypeptide of claim 7 wherein at least one of the ligand
binding domains is derived from a retinoid X acid receptor.

WO 02/06463    PCT/EP01/08190

10. The polypeptide of claim 1 wherein the first functional domain is a 
DNA binding domain. 

1 1 . The polypeptide of claim 1 0 wherein the DNA binding domain 
5 comprises at least one zinc finger DNA binding motif. 

12. The polypeptide of claim 1 1 that comprises from two to twelve 
zinc finger DNA binding motifs. 

10 13. The polypeptide of claim 1 1 that comprises from two to six zinc 

finger binding motifs, 

14. The polypeptide of claim 1 1 wherein the zinc finger DNA binding 
motife specifically bind to a nucleotide sequence of the formula (GNN)i_6, where 

15 G is guanidine and N is any nucleotide. 

15. The polypeptide of claim 1 wherein the first functional domain is a 
transcriptional regulating domain. 

20 16. The polypeptide of claim 1 further comprising a second functional 

domain operatively linked to either one of the ligand binding domains or the first 
functional domain. 

17. The polypeptide of claim 16 wherein the first functional domain is 
25 a DNA binding domain and the second functional domain is a transcriptional 

regulating domain. 

18. The polypeptide of claiml7 wherein the DNA binding domain 
comprises at least one zinc finger DNA binding motif. 

30 

19. The polypeptide of claim 1 8 that comprises from two to twelve 
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zinc finger DN A binding motifs. 

20. The polypeptide of claim 1 8 that comprises from two to six zinc 
finger DNA binding motifs. 

5 

21 . The polypeptide of claim 1 8 wherein the zinc finger DNA binding 
motifs specifically bind to a nucleotide sequence of the formula (GNN)i^, where 
G is guanidine and N is any nucleotide. 

10 22. The polypeptide of claim 17 wherein the transcriptional regulating 

domain is an activation domain. 

23. The polypeptide of claiml7 wherein the transcriptional regulating 
domain is a repression domain. 

15 

24. A non-naturally occurring polypeptide comprising (a) a DNA 
binding domain having from two to six zinc finger DNA binding motife; (b) a first 
ligand binding domain derived from a retinoid X receptor operatively linked to 
the DNA binding domain, a second ligand binding domain derived from an 

20 ecdyzone receptor operatively linked to the first ligand binding domain with a 
peptide spacer of from 1 8 to 36 amino acid residues; and (c) a transcriptional 
regulating domain operatively linked to the second ligand binding domain. 

25. A non-naturally occurring polypeptide comprising (a) a DNA 
binding domain having from three to six zinc finger DNA binding motifs; (b) a 
first ligand binding domain derived from a progesterone receptor operatively 
linked to the DNA binding domain, a second ligand binding domain derived from 
a progesterone receptor linked to the first ligand binding domain with a peptide 
spacer of from 18 to 36 amino acid residues; and (c) a transcriptional regulating 
domain operatively linked to the second ligand binding domain. 

26. A polynucleotide that encodes the polypeptide of claim 1. 

27. A polynucleotide that encodes the polypeptide of claim 17. 

28. An expression vector comprising the polynucleotide of claim 26. 

29. An expression vector comprising the polynucleotide of claim 27. 

30. A cell containing the polynucleotide of claim 26. 

31. A cell containing the polynucleotide of claim 27. 

32. A host cell transformed with the expression vector of claim 28. 

33. A host cell transformed with the expression vector of claim 29. 

34. A process of regulating the function of a target nucleotide that 
contains a defined sequence, the process comprising exposing the target 
nucleotide to the polypeptide of claim 1 in the presence of a ligand that binds one 
of the ligand binding domains of the polypeptide, wherein the DNA binding 
domain of the polypeptide binds the defined sequence. 

35. A process of regulating the function of a target nucleotide that 
contains a defined sequence, the process comprising exposing the target 
nucleotide to the polypeptide of claim 17 in the presence of a ligand that binds 
one of the ligand binding domains of the polypeptide, wherein the functional 
domain of the polypeptide alters the function of the target nucleotide. 
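
The (GNN) target-site formula recited in claim 21 lends itself to a mechanical check. The following is a minimal sketch only, assuming the formula is read as one to six consecutive GNN triplets and that N is any of A, C, G or T; the function name `is_gnn_site` is hypothetical and is not part of the disclosure.

```python
import re

# Assumption: the formula denotes 1 to 6 consecutive GNN triplets,
# where N is any one of A, C, G or T.
GNN_SITE = re.compile(r"(?:G[ACGT]{2}){1,6}")


def is_gnn_site(seq: str) -> bool:
    """Return True if seq consists solely of 1 to 6 GNN triplets."""
    return GNN_SITE.fullmatch(seq.upper()) is not None


print(is_gnn_site("GCCGAGGTG"))  # three triplets, each starting with G
print(is_gnn_site("GCCTAGGTG"))  # second triplet starts with T
```

A six-finger protein of the kind recited in claim 20 would, under this reading, correspond to an 18-nucleotide site of six such triplets.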
m 5 GCC CAG GCG GCC 
ggt: CGG GTC CGC CGG 

20 30 40 50 60 

CTC GAG CCC GGG GAG AAG CCC TAT GCT TGT CCG GAA TGT GGT AAG 
GAG CTC GGG CCC CTC TTC GGG ATA CGA ACA GGC CTT ACA CCA TTC 
 L   E   P   G   E   K   P   Y   A   C   P   E   C   G   K> 

70 80 90 100 110 120 

TCC TTC AGT CGC AGC GAT GTG CTG GTG CGC CAC CAG CGT ACC CAC ACG GGT GAA AAA CCG 
AGG AAG TCA GCG TCG CTA CAC GAC CAC GCG GTG GTC GCA TGG GTG TGC CCA CTT TTT GGC 
 S   F   S   R   S   D   V   L   V   R   H   Q   R   T   H   T   G   E   K   P> 

130 140 150 160 170 180 

TAT AAA TGC CCA GAG TGC GGC AAA TCT TTT AGC CGC AGC GAT GAT CTG GTT CGC CAT CAA 
ATA TTT ACG GGT CTC ACG CCG TTT AGA AAA TCG GCG TCG CTA CTA GAC CAA GCG GTA GTT 
 Y   K   C   P   E   C   G   K   S   F   S   R   S   D   D   L   V   R   H   Q> 

190 200 210 220 230 240 

CGC ACT CAT ACT GGC GAG AAG CCA TAC AAA TGT CCA GAA TGT GGC AAG TCT TTC TCC CAG 
GCG TGA GTA TGA CCG CTC TTC GGT ATG TTT ACA GGT CTT ACA CCG TTC AGA AAG AGG GTC 
 R   T   H   T   G   E   K   P   Y   K   C   P   E   C   G   K   S   F   S   Q> 

250 260 270 280 290 300 

TCT AGC CAC CTG GTT CGC CAC CAA CGT ACT CAC ACC GGG GAG AAG CCC TAT GCT TGT CCG 
AGA TCG GTG GAC CAA GCG GTG GTT GCA TGA GTG TGG CCC CTC TTC GGG ATA CGA ACA GGC 
 S   S   H   L   V   R   H   Q   R   T   H   T   G   E   K   P   Y   A   C   P> 

310 320 330 340 350 360 

GAA TGT GGT AAG TCC TTC AGC CGC AGC GAT AAC CTG GTG CGC CAC CAG CGT ACC CAC ACG 
CTT ACA CCA TTC AGG AAG TCG GCG TCG CTA TTG GAC CAC GCG GTG GTC GCA TGG GTG TGC 
 E   C   G   K   S   F   S   R   S   D   N   L   V   R   H   Q   R   T   H   T> 

370 380 390 400 410 420 

GGT GAA AAA CCG TAT AAA TGC CCA GAG TGC GGC AAA TCT TTT AGC CAG GCC GGC CAC CTG 
CCA CTT TTT GGC ATA TTT ACG GGT CTC ACG CCG TTT AGA AAA TCG GTC CGG CCG GTG GAC 
 G   E   K   P   Y   K   C   P   E   C   G   K   S   F   S   Q   A   G   H   L> 

430 440 450 460 470 480 

GCC AGC CAT CAA CGC ACT CAT ACT GGC GAG AAG CCA TAC AAA TGT CCA GAA TGT GGC AAG 
CGG TCG GTA GTT GCG TGA GTA TGA CCG CTC TTC GGT ATG TTT ACA GGT CTT ACA CCG TTC 
 A   S   H   Q   R   T   H   T   G   E   K   P   Y   K   C   P   E   C   G   K> 

490 500 510 520 530 540 

TCT TTC AGT GAT TGT CGT GAT CTT GCG AGG CAC CAA CGT ACT CAC ACC GGT AAA AAA ACT 
AGA AAG TCA CTA ACA GCA CTA GAA CGC TCC GTG GTT GCA TGA GTG TGG CCA TTT TTT TGA 
 S   F   S   D   C   R   D   L   A   R   H   Q   R   T   H   T   G   K   K   T> 

550 

AGT GGC CAG GCC GGC 
TCA CCG GTC CGG CCG 
 S   G   Q   A   G   X> 
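
Each block of the listing above gives three rows: the coding strand in codon triplets, the base-paired complementary strand written in the same left-to-right order, and the one-letter amino acid translation. The following minimal sketch shows how one block can be cross-checked, using the row at positions 181 to 240. The helper names are hypothetical, and the codon table is the standard genetic code, which is assumed (not stated in the listing) to be the one used.

```python
# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(codons: str) -> str:
    """Base-pair complement in the same left-to-right order (spaces kept)."""
    return codons.translate(COMPLEMENT)


def translate(codons: str) -> str:
    """Translate space-separated codons into one-letter amino acid codes."""
    return "".join(CODON_TABLE[codon] for codon in codons.split())


# Coding-strand row for positions 181-240 of the listing above.
top = ("CGC ACT CAT ACT GGC GAG AAG CCA TAC AAA "
       "TGT CCA GAA TGT GGC AAG TCT TTC TCC CAG")
print(complement(top))
print(translate(top))
```

Running the same check on the other rows is one way to locate positions where the coding strand, complementary strand and translation disagree.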
10 

NN 3 GCC CAG GCG GCC 
UK Z CGG GTC CGC CGG 

20 30 40 50 60 

CTC GAG CCC TAT GCT TGC CCT GTC GAG TCC TGC GAT CGC CGC TTT 
GAG CTC GGG ATA CGA ACG GGA CAG CTC AGG ACG CTA GCG GCG AAA 
 L   E   P   Y   A   C   P   V   E   S   C   D   R   R   F> 

70 80 90 100 110 120 

TCT AAG TCG GCT GAT CTG AAG CGC CAT ATC CGC ATC CAC ACA GGC CAG AAG CCC TTC CAG 
AGA TTC AGC CGA CTA GAC TTC GCG GTA TAG GCG TAG GTG TGT CCG GTC TTC GGG AAG GTC 
 S   K   S   A   D   L   K   R   H   I   R   I   H   T   G   Q   K   P   F   Q> 

130 140 150 160 170 180 

TGT CGA ATA TGC ATG CGT AAC TTC AGT CGT AGT GAC CAC CTT ACC ACC CAC ATC CGC ACC 
ACA GCT TAT ACG TAC GCA TTG AAG TCA GCA TCA CTG GTG GAA TGG TGG GTG TAG GCG TGG 
 C   R   I   C   M   R   N   F   S   R   S   D   H   L   T   T   H   I   R   T> 

190 200 210 220 230 240 

CAC ACA GGC GAG AAG CCT TTT GCC TGT GAC ATT TGT GGG AGG AAG TTT GCC AGG AGT GAT 
GTG TGT CCG CTC TTC GGA AAA CGG ACA CTG TAA ACA CCC TCC TTC AAA CGG TCC TCA CTA 
 H   T   G   E   K   P   F   A   C   D   I   C   G   R   K   F   A   R   S   D> 

250 260 270 280 290 300 

GAA CGC AAG AGG CAT ACC AAA ATC CAT ACC GGT GAG AAG CCC TAT GCT TGC CCT GTC GAG 
CTT GCG TTC TCC GTA TGG TTT TAG GTA TGG CCA CTC TTC GGG ATA CGA ACG GGA CAG CTC 
 E   R   K   R   H   T   K   I   H   T   G   E   K   P   Y   A   C   P   V   E> 

310 320 330 340 350 360 

TCC TGC GAT CGC CGC TTT TCT AAG TCG GCT GAT CTG AAG CGC CAT ATC CGC ATC CAC ACA 
AGG ACG CTA GCG GCG AAA AGA TTC AGC CGA CTA GAC TTC GCG GTA TAG GCG TAG GTG TGT 
 S   C   D   R   R   F   S   K   S   A   D   L   K   R   H   I   R   I   H   T> 

370 380 390 400 410 420 

GGC CAG AAG CCC TTC CAG TGT CGA ATA TGC ATG CGT AAC TTC AGT CGT AGT GAC CAC CTT 
CCG GTC TTC GGG AAG GTC ACA GCT TAT ACG TAC GCA TTG AAG TCA GCA TCA CTG GTG GAA 
 G   Q   K   P   F   Q   C   R   I   C   M   R   N   F   S   R   S   D   H   L> 

430 440 450 460 470 480 

ACC ACC CAC ATC CGC ACC CAC ACA GGC GAG AAG CCT TTT GCC TGT GAC ATT TGT GGG AGG 
TGG TGG GTG TAG GCG TGG GTG TGT CCG CTC TTC GGA AAA CGG ACA CTG TAA ACA CCC TCC 
 T   T   H   I   R   T   H   T   G   E   K   P   F   A   C   D   I   C   G   R> 

490 500 510 520 530 540 

AAG TTT GCC AGG AGT GAT GAA CGC AAG AGG CAT ACC AAA ATC CAT TTA AGA CAG AAG GAC 
TTC AAA CGG TCC TCA CTA CTT GCG TTC TCC GTA TGG TTT TAG GTA AAT TCT GTC TTC CTG 
 K   F   A   R   S   D   E   R   K   R   H   T   K   I   H   L   R   Q   K   D> 

550 560 

TCT AGA ACT AGT GGC CAG GCC GGC 
AGA TCT TGA TCA CCG GTC CGG CCG 
 S   R   T   S   G   Q   A   G> 
10 

NP 3 GCC CAG GCG GCC 
NK : CGG GTC CGC CGG 

20 30 40 50 60 

CTC GAG CCC GGG GAG AAG CCC TAT GCT TGT CCG GAA TGT GGT AAG 
GAG CTC GGG CCC CTC TTC GGG ATA CGA ACA GGC CTT ACA CCA TTC 
 L   E   P   G   E   K   P   Y   A   C   P   E   C   G   K> 

70 80 90 100 110 120 

TCC TTC AGC ACC AGT GGC CAC CTG GTG CGC CAC CAG CGT ACC CAC ACG GGT GAA AAA CCG 
AGG AAG TCG TGG TCA CCG GTG GAC CAC GCG GTG GTC GCA TGG GTG TGC CCA CTT TTT GGC 
 S   F   S   T   S   G   H   L   V   R   H   Q   R   T   H   T   G   E   K   P> 

130 140 150 160 170 180 

TAT AAA TGC CCA GAG TGC GGC AAA TCT TTT AGT CGC AGC GAT GTG CTG GTG CGC CAT CAA 
ATA TTT ACG GGT CTC ACG CCG TTT AGA AAA TCA GCG TCG CTA CAC GAC CAC GCG GTA GTT 
 Y   K   C   P   E   C   G   K   S   F   S   R   S   D   V   L   V   R   H   Q> 

190 200 210 220 230 240 

CGC ACT CAT ACT GGC GAG AAG CCA TAC AAA TGT CCA GAA TGT GGC AAG TCT TTC TCA CGT 
GCG TGA GTA TGA CCG CTC TTC GGT ATG TTT ACA GGT CTT ACA CCG TTC AGA AAG AGT GCA 
 R   T   H   T   G   E   K   P   Y   K   C   P   E   C   G   K   S   F   S   R> 

250 260 270 280 290 300 

TCA GAC GAC TTG GTC CGT CAC CAA CGT ACT CAC ACC GGG GAG AAG CCC TAT GCT TGT CCG 
AGT CTG CTG AAC CAG GCA GTG GTT GCA TGA GTG TGG CCC CTC TTC GGG ATA CGA ACA GGC 
 S   D   D   L   V   R   H   Q   R   T   H   T   G   E   K   P   Y   A   C   P> 

310 320 330 340 350 360 

GAA TGT GGT AAG TCC TTC AGT GAT CCT GGC AAC CTG GTT CGC CAC CAG CGT ACC CAC ACG 
CTT ACA CCA TTC AGG AAG TCA CTA GGA CCG TTG GAC CAA GCG GTG GTC GCA TGG GTG TGC 
 E   C   G   K   S   F   S   D   P   G   N   L   V   R   H   Q   R   T   H   T> 

370 380 390 400 410 420 

GGT GAA AAA CCG TAT AAA TGC CCA GAG TGC GGC AAA TCT TTT AGT CGC TCC GAT AAA CTG 
CCA CTT TTT GGC ATA TTT ACG GGT CTC ACG CCG TTT AGA AAA TCA GCG AGG CTA TTT GAC 
 G   E   K   P   Y   K   C   P   E   C   G   K   S   F   S   R   S   D   K   L> 

430 440 450 460 470 480 

GTG CGC CAT CAA CGC ACT CAT ACT GGC GAG AAG CCA TAC AAA TGT CCA GAA TGT GGC AAG 
CAC GCG GTA GTT GCG TGA GTA TGA CCG CTC TTC GGT ATG TTT ACA GGT CTT ACA CCG TTC 
 V   R   H   Q   R   T   H   T   G   E   K   P   Y   K   C   P   E   C   G   K> 

490 500 510 520 530 540 

TCT TTC TCC CAG TCT AGC CAC CTG GTT CGC CAC CAA CGT ACT CAC ACC GGT AAA AAA ACT 
AGA AAG AGG GTC AGA TCG GTG GAC CAA GCG GTG GTT GCA TGA GTG TGG CCA TTT TTT TGA 
 S   F   S   Q   S   S   H   L   V   R   H   Q   R   T   H   T   G   K   K   T> 

550 

AGT GGC CAG GCC GGC ChM 
TCA CCG GTC CGG CCG Gft N 
 S   G   Q   A   G 
FIG. 7 
10 20 30 40 50 60 

GGA TCC GCC ACC ATG GCC CAG GCG GCC CTC GAG CCC GGG GAG AAG CCC TAT GCT TGT CCG 
CCT AGG CGG TGG TAC CGG GTC CGC CGG GAG CTC GGG CCC CTC TTC GGG ATA CGA ACA GGC 
 G   S   A   T   M   A   Q   A   A   L   E   P   G   E   K   P   Y   A   C   P> 

70 80 90 100 110 120 

GAA TGT GGT AAG TCC TTC AGT AGG AAG GAT TCG CTT GTG AGG CAC CAG CGT ACC CAC ACG 
CTT ACA CCA TTC AGG AAG TCA TCC TTC CTA AGC GAA CAC TCC GTG GTC GCA TGG GTG TGC 
 E   C   G   K   S   F   S   R   K   D   S   L   V   R   H   Q   R   T   H   T> 

130 140 150 160 170 180 

GGT GAA AAA CCG TAT AAA TGC CCA GAG TGC GGC AAA TCT TTT AGT CAG TCG GGG GAT CTT 
CCA CTT TTT GGC ATA TTT ACG GGT CTC ACG CCG TTT AGA AAA TCA GTC AGC CCC CTA GAA 
 G   E   K   P   Y   K   C   P   E   C   G   K   S   F   S   Q   S   G   D   L> 

190 200 210 220 230 240 

AGG CGT CAT CAA CGC ACT CAT ACT GGC GAG AAG CCA TAC AAA TGT CCA GAA TGT GGC AAG 
TCC GCA GTA GTT GCG TGA GTA TGA CCG CTC TTC GGT ATG TTT ACA GGT CTT ACA CCG TTC 
 R   R   H   Q   R   T   H   T   G   E   K   P   Y   K   C   P   E   C   G   K> 

250 260 270 280 290 300 

TCT TTC AGT GAT TGT CGT GAT CTT GCG AGG CAC CAA CGT ACT CAC ACC GGG GAG AAG CCC 
AGA AAG TCA CTA ACA GCA CTA GAA CGC TCC GTG GTT GCA TGA GTG TGG CCC CTC TTC GGG 
 S   F   S   D   C   R   D   L   A   R   H   Q   R   T   H   T   G   E   K   P> 

310 320 330 340 350 360 

TAT GCT TGT CCG GAA TGT GGT AAG TCC TTC TCT CAG AGC TCT CAC CTG GTG CGC CAC CAG 
ATA CGA ACA GGC CTT ACA CCA TTC AGG AAG AGA GTC TCG AGA GTG GAC CAC GCG GTG GTC 
 Y   A   C   P   E   C   G   K   S   F   S   Q   S   S   H   L   V   R   H   Q> 

370 380 390 400 410 420 

CGT ACC CAC ACG GGT GAA AAA CCG TAT AAA TGC CCA GAG TGC GGC AAA TCT TTT AGT GAC 
GCA TGG GTG TGC CCA CTT TTT GGC ATA TTT ACG GGT CTC ACG CCG TTT AGA AAA TCA CTG 
 R   T   H   T   G   E   K   P   Y   K   C   P   E   C   G   K   S   F   S   D> 

430 440 450 460 470 480 

TGC CGC GAC CTT GCT CGC CAT CAA CGC ACT CAT ACT GGC GAG AAG CCA TAC AAA TGT CCA 
ACG GCG CTG GAA CGA GCG GTA GTT GCG TGA GTA TGA CCG CTC TTC GGT ATG TTT ACA GGT 
 C   R   D   L   A   R   H   Q   R   T   H   T   G   E   K   P   Y   K   C   P> 

490 500 510 520 530 540 

GAA TGT GGC AAG TCT TTC AGC CGC TCT GAC AAG CTG GTG CGT CAC CAA CGT ACT CAC ACC 
CTT ACA CCG TTC AGA AAG TCG GCG AGA CTG TTC GAC CAC GCA GTG GTT GCA TGA GTG TGG 
 E   C   G   K   S   F   S   R   S   D   K   L   V   R   H   Q   R   T   H   T> 

550 560 570 580 590 600 

GGT AAA AAA ACT AGT GGC CAG GCC GGC CGC CGA AAT GAA ATG GGT GCT TCA GGA GAC ATG 
CCA TTT TTT TGA TCA CCG GTC CGG CCG GCG GCT TTA CTT TAC CCA CGA AGT CCT CTG TAC 
 G   K   K   T   S   G   Q   A   G   R   R   N   E   M   G   A   S   G   D   M> 

610 620 630 640 650 660 

AGG GCT GCC AAC CTT TGG CCA AGC CCT CTT GTG ATT AAG CAC ACT AAG AAG AAT AGC CCT 
TCC CGA CGG TTG GAA ACC GGT TCG GGA GAA CAC TAA TTC GTG TGA TTC TTC TTA TCG GGA 
 R   A   A   N   L   W   P   S   P   L   V   I   K   H   T   K   K   N   S   P> 

670 680 690 700 710 720 

GCC TTG TCC TTG ACA GCT GAC CAG ATG GTC AGT GCC TTG TTG GAT GCT GAA CCG CCC ATG 
CGG AAC AGG AAC TGT CGA CTG GTC TAC CAG TCA CGG AAC AAC CTA CGA CTT GGC GGG TAC 
 A   L   S   L   T   A   D   Q   M   V   S   A   L   L   D   A   E   P   P   M> 

730 740 750 760 770 780 

ATC TAT TCT GAA TAT GAT CCT TCT AGA CCC TTC AGT GAA GCC TCA ATG ATG GGC TTA TTG 
TAG ATA AGA CTT ATA CTA GGA AGA TCT GGG AAG TCA CTT CGG AGT TAC TAC CCG AAT AAC 
 I   Y   S   E   Y   D   P   S   R   P   F   S   E   A   S   M   M   G   L   L> 

790 800 810 820 830 840 

ACC AAC CTA GCA GAT AGG GAG CTG GTT CAT ATG ATC AAC TGG GCA AAG AGA GTG CCA GGC 
TGG TTG GAT CGT CTA TCC CTC GAC CAA GTA TAC TAG TTG ACC CGT TTC TCT CAC GGT CCG 
 T   N   L   A   D   R   E   L   V   H   M   I   N   W   A   K   R   V   P   G> 

850 860 870 880 890 900 

TTT GGG GAC TTG AAT CTC CAT GAT CAG GTC CAC CTT CTC GAG TGT GCC TGG CTG GAG ATT 
AAA CCC CTG AAC TTA GAG GTA CTA GTC CAG GTG GAA GAG CTC ACA CGG ACC GAC CTC TAA 
 F   G   D   L   N   L   H   D   Q   V   H   L   L   E   C   A   W   L   E   I> 

910 920 930 940 950 960 

CTG ATG ATT GGT CTC GTC TGG CGC TCC ATG GAA CAC CCG GGG AAG CTC CTG TTT GCT CCT 
GAC TAC TAA CCA GAG CAG ACC GCG AGG TAC CTT GTG GGC CCC TTC GAG GAC AAA CGA GGA 
 L   M   I   G   L   V   W   R   S   M   E   H   P   G   K   L   L   F   A   P> 

970 980 990 1000 1010 1020 

AAC TTG CTC CTG GAC AGG AAT CAA GGT AAA TGT GTG GAA GGC ATG GTG GAG ATC TTT GAC 
TTG AAC GAG GAC CTG TCC TTA GTT CCA TTT ACA CAC CTT CCG TAC CAC CTC TAG AAA CTG 
 N   L   L   L   D   R   N   Q   G   K   C   V   E   G   M   V   E   I   F   D> 

1030 1040 1050 1060 1070 1080 

ATG TTG CTT GCT ACG TCA AGT CGG TTC CGC ATG ATG AAC CTG CAG GGT GAA GAG TTT GTG 
TAC AAC GAA CGA TGC AGT TCA GCC AAG GCG TAC TAC TTG GAC GTC CCA CTT CTC AAA CAC 
 M   L   L   A   T   S   S   R   F   R   M   M   N   L   Q   G   E   E   F   V> 

1090 1100 1110 1120 1130 1140 

TGC CTC AAA TCC ATC ATT TTG CTT AAT TCC GGA GTG TAC ACG TTT CTG TCC AGC ACC TTG 
ACG GAG TTT AGG TAG TAA AAC GAA TTA AGG CCT CAC ATG TGC AAA GAC AGG TCG TGG AAC 
 C   L   K   S   I   I   L   L   N   S   G   V   Y   T   F   L   S   S   T   L> 

1150 1160 1170 1180 1190 1200 

AAG TCT CTG GAA GAG AAG GAC CAC ATC CAC CGT GTC CTG GAC AAG ATC ACA GAC ACT TTG 
TTC AGA GAC CTT CTC TTC CTG GTG TAG GTG GCA CAG GAC CTG TTC TAG TGT CTG TGA AAC 
 K   S   L   E   E   K   D   H   I   H   R   V   L   D   K   I   T   D   T   L> 

1210 1220 1230 1240 1250 1260 

ATC CAC CTG ATG GCC AAA GCT GGC CTG ACT CTG CAG CAG CAG CAT CGC CGC CTA GCT CAG 
TAG GTG GAC TAC CGG TTT CGA CCG GAC TGA GAC GTC GTC GTC GTA GCG GCG GAT CGA GTC 
 I   H   L   M   A   K   A   G   L   T   L   Q   Q   Q   H   R   R   L   A   Q> 

1270 1280 1290 1300 1310 1320 

CTC CTT CTC ATT CTT TCC CAT ATC CGG CAC ATG AGT AAC AAA GGC ATG GAG CAT CTC TAC 
GAG GAA GAG TAA GAA AGG GTA TAG GCC GTG TAC TCA TTG TTT CCG TAC CTC GTA GAG ATG 
 L   L   L   I   L   S   H   I   R   H   M   S   N   K   G   M   E   H   L   Y> 

1330 1340 1350 1360 1370 1380 

AAC ATG AAA TGC AAG AAC GTT GTG CCC CTC TAT GAC CTG CTC CTG GAG ATG TTG GAT GCC 
TTG TAC TTT ACG TTC TTG CAA CAC GGG GAG ATA CTG GAC GAG GAC CTC TAC AAC CTA CGG 